Abstract


We  examine  the  determinants  of  research  productivity  in  the  pharmaceutical  industry 
using  detailed  internal  firm  data.  Larger  research  efforts  are  more  productive,  not  only 
because  they  enjoy  conventional  economies  of  scale,  but  also  because  they  realize 
economies  of  scope  by  sustaining  an  adequately  diverse  portfolio  of  research  projects  and 
by  capturing  internal  and  external  spillovers  of  knowledge.  This  ability  has  become  more 
important  as  the  industry  has  moved  from  "random"  to  "rational"  techniques  of  drug 
discovery.  We  also  find  surprisingly  large  and  persistent  heterogeneities  among  firms, 
both  in  research  productivity  and  in  the  structure  of  their  research  portfolios. 
Introduction

Economies  of  scale  and  scope  in  the  production  of  knowledge  play  an  important  role  in  the  theory 
of  the  firm,  and  the  belief  that  there  are  significant  spillovers  of  knowledge  across  firms  has  assumed  a 
central  role  in  new  theories  of  economic  growth.  But  although  a  considerable  body  of  historical  evidence 
is  consistent  with  the  existence  of  scale  and  scope  economies  in  research,  econometric  efforts  to  document 
their  importance  have  generated  surprisingly  inconclusive  and  sometimes  contradictory  results.  Evidence 
for  the  existence  of  spillovers,  although  quite  compelling,  is  also  unsatisfactorily  incomplete  given  the 
importance  of  their  role  in  modern  theory. 

In  this  paper  we  look  inside  the  firm  for  evidence  as  to  the  importance  of  scale,  scope  and 
spillovers  in  drug  discovery,  using  detailed  data  on  individual  research  programs  obtained  from  the 
internal  records  of  ten  major  research-oriented  pharmaceutical  companies.  This  narrow  focus  mitigates 
some  of  the  problems  that  may  be  responsible  for  the  inconclusive  nature  of  the  existing  empirical 
literature  on  firm  size  and  R&D.  Rather  than  drawing  inferences  from  the  behavior  of  firm  aggregates, 
we  have  direct  measurements  of  the  phenomena  of  interest  -  the  size  and  structure  of  firms'  portfolios 
of  research  projects  -  allowing  us  to  distinguish  between  economies  of  scope  and  economies  of  scale, 
and  we  are  able  to  control  quite  precisely  for  cross-sectional  variation  in  appropriability  and  technological 
opportunity  conditions.  The  pharmaceutical  industry  presents  a  particularly  interesting  arena  in  which  to 
study  these  issues  since  not  only  is  it  one  of  the  most  research-intensive,  innovative,  and  closely 
scrutinized industries, but it has also recently undergone a dramatic shift from so-called "random" to
"rational"  drug  discovery.  This  shift  appears  to  have  changed  the  nature  of  the  returns  to  scope  and  scale 
in  research  with  potentially  important  consequences  for  the  competitive  dynamics  of  the  industry. 

Our  results  suggest  that  there  are  significant  returns  to  size  in  pharmaceutical  research,  but  that 
only  a  small  portion  of  these  returns  are  derived  from  conventional  economies  of  scale.  The  primary 
advantage  of  large  firms  appears  to  be  their  ability  to  realize  economies  of  scope:  to  sustain  an  adequately 
diverse  portfolio  of  research  projects,  and  to  make  use  of  internal  and  external  spillovers  of  knowledge. 
This  may  explain,  for  example,  why  the  majority  of  firms  in  our  sample  sustain  a  surprisingly  large 
number  of  relatively  small  and  seemingly  "unproductive"  research  programs. 

But  while  the  size  and  scope  of  the  research  portfolio  have  significant  implications  for  research 
productivity,  in  these  data,  at  least,  most  of  the  observed  variation  in  the  productivity  of  individual 
research  programs  is  accounted  for  by  fixed  firm  and  therapeutic  class  effects,  and  by  accumulated 
"knowledge  capital" :  ex  post,  the  most  productive  research  programs  are  those  that  are  in  the  "right"  area 
and  the  "right"  firm  and  that  have  been  successful  in  the  past.  Moreover,  despite  the  evidence  we  find 
for significant relationships between the diversity and "focus" of the research portfolio and its
productivity,  the  firms  in  our  sample  maintained  marked  differences  in  these  dimensions  over  extended 
periods  of  time.  We  observe  similar  heterogeneity  in  our  firms'  responses  to  the  industry's  shift  from 
"random"  to  "rational"  techniques  of  drug  discovery,  which  appears  to  have  significantly  increased  the 
costs  of  pharmaceutical  research,  and  to  have  altered  the  relative  returns  to  scale,  scope,  and  capturing 
spillovers.  Thus  our  results  raise  a  number  of  intriguing  questions  for  further  research. 

The  paper  begins  with  a  short  literature  review  as  background  to  the  development  of  our 
hypotheses.  Subsequent  sections  discuss  some  of  the  estimation  issues  and  describe  the  data  on  which  the 
study  is  based.  Section  (4)  outlines  the  empirical  results,  and  the  paper  closes  with  a  brief  discussion  of 
their  implications. 

Hypothesis  Development  and  Literature  Review. 

Scale,  scope,  spillovers  and  research  productivity. 

In principle, size confers three major advantages in performing R&D (Fisher and Temin, 1973;
Panzar  and  Willig,  1981;  Cohen  and  Levin,  1989;  Schumpeter  1934,  1950).  Firstly,  in  the  absence  of 
fully  functioning  markets  for  innovation,  larger  firms  may  be  able  to  spread  the  fixed  costs  of  research 
over  a  larger  sales  base.  Secondly,  large  firms  may  also  have  advantages  in  the  financial  markets  over 
smaller  firms:  to  the  degree  that  they  are  able  to  mitigate  problems  of  adverse  selection  and  moral  hazard 
in  raising  capital,  they  may  be  better  positioned  to  fund  risky  projects.  Lastly,  larger  firms  may  also  be 
able  to  exploit  complementarities  within  the  firm  to  increase  the  productivity  of  research,  where 
complementarities  may  exist  both  between  research  projects  within  any  given  research  effort,  and  between 
other  functions  of  the  firm  such  as  marketing  and  manufacturing. 

These  effects  are  of  long  standing  theoretical  interest  (Panzar,  1989).  Economies  of  scale  have 
long  been  thought  to  play  a  central  role  in  the  determination  of  industry  structure,  and  economies  of  scope 
in  R&D  play  an  important  role  in  the  theory  of  the  firm.  As  Teece  (1980)  and  Cohen  and  Levinthal 
(1989)  have  pointed  out,  since  there  are  several  well  known  problems  in  the  market  for  information 
(Arrow,  1962),  economies  of  scope  in  research  and  development  are  likely  to  be  a  particularly  powerful 
rationale  for  the  existence  of  multiproduct  firms.  Externalities  arising  from  the  public  goods  aspect  of 
knowledge  have  also  assumed  a  central  role  in  modern  growth  theory  (Romer,  1986;  Grossman  and 
Helpman,  1991),  and  R&D  spillovers  also  play  a  key  role  in  many  models  of  market  structure  (Spence, 
1984;  Dasgupta  and  Stiglitz,  1980). 

However, despite the theoretical importance of this topic, systematic empirical tests of the
relationship between firm size and research productivity have been largely inconclusive. Although much
qualitative  evidence  is  consistent  with  the  belief  that  size  confers  significant  advantages,  lack  of 
appropriately  detailed  data  on  R&D  activity  within  the  firm  has  made  it  difficult  to  draw  conclusions 
about  the  presence  of  economies  of  scale  in  performing  R&D,  or  to  distinguish  between  economies  of 
scale  and  economies  of  scope  (Freeman,  1982;  Chandler,  1990;  Mowery,  1989). 

Baldwin  and  Scott  (1987),  and  Cohen  and  Levin  (1989),  survey  the  contradictory  results  of  the 
extensive  econometric  literature  which  has  attempted  to  establish  a  positive  relationship  between  firm  size 
and  R&D  intensity  —  a  stream  of  research  which,  as  Fisher  and  Temin  (1973)  have  pointed  out,  cannot 
in  any  case  be  taken  as  tests  of  these  ideas.  Unfortunately  the  results  of  later  work  exploring  the  more 
appropriate  relationship  between  research  output  and  firm  size  have  been  similarly  unconvincing.  Bound 
et  al.  (1984)  found  that  there  was  evidence  of  constant  returns  to  scale  for  R&D  programs  between  $2m 
and  $100m  when  patents  were  used  as  a  measure  of  research  output,  but  their  results  were  quite  sensitive 
to specification assumptions. Acs and Audretsch (1988), using a count of "major innovations" as a
measure of output, found that in highly concentrated industries with high barriers to entry large firms were
likely  to  be  the  source  of  the  majority  of  innovations,  while  in  less  concentrated,  less  mature  industries 
smaller  firms  were  likely  to  be  more  innovative.  A  study  by  Pavitt  et  al.  (1987),  using  a  similar  measure 
of  output,  suggested  that  both  very  small  firms  and  very  large  firms  were  proportionately  more  innovative 
than  more  moderate  sized  firms. 

Similarly, although the presence of economies of scope in production has been verified in many
industries,  including  airlines,  trucking,  banking  and  advertising  (Caves,  Christensen  and  Tretheway, 
1984;  Friedlaender,  Winston  and  Wang,  1983;  Glass  and  McKillop,  1992;  Silk  and  Berndt,  1993),  a  lack 
of  appropriately  detailed  data  has  meant  that  with  some  notable  exceptions  we  have  very  little  empirical 
evidence  as  to  their  importance  in  research  (Helfat,  1994).  Empirical  estimates  of  the  impact  of  spillovers 
have  also  been  difficult  to  obtain  (Griliches,  1992).  On  the  one  hand  it  is  difficult  to  distinguish  between 
the  influence  that  spillovers  from  other  industries  have  on  research  productivity  through  their  unmeasured 
effect  upon  input  costs  and  the  influence  that  they  have  directly  on  research  productivity  by  increasing 
the  total  stock  of  knowledge  available  to  researchers.  On  the  other,  the  accurate  estimation  of  spillover 
effects  is  dependent  on  adequate  measures  of  technological  "distance"  and  of  R&D  capital,  both  constructs 
that  are  exceedingly  difficult  to  measure.  Jaffe  (1986)  provides  the  best  example  of  a  careful  attempt  to 
account  for  these  problems,  and  his  work  suggests  that  spillovers  have  important  effects  on  research 
productivity,  but  there  have  been  few  attempts  to  duplicate  his  results  using  disaggregated  data. 

Studies of the relationship between size and productivity in the pharmaceutical industry have had
similarly inconclusive results. While Comanor (1965), Vernon and Gusen (1974) and Graves and
Langowitz  (1993)  found  evidence  for  decreasing  returns  to  scale  in  R&D,  Schwartzman  (1976)  suggested 
that  there  were  significant  economies  of  scale  in  pharmaceutical  research,  and  Jensen  (1987)  found  that 
beyond  a  (quite  small)  threshold  neither  firm  size  nor  the  size  of  the  research  effort  affected  the  marginal 
productivity  of  research  and  development.  Furthermore,  while  the  qualitative  discussions  in  these  papers 
acknowledge  the  probable  importance  of  scope  economies,  the  quantitative  analyses  do  not  distinguish 
them  from  economies  of  scale,  and  although  Dranove  and  Ward  (1991)  and  Gambardella  (1992)  have 
shown  that  in  aggregate  private  research  productivity  is  significantly  correlated  with  the  generation  of 
knowledge  by  the  public  sector,  there  has  been  no  systematic  study  in  this  industry  of  the  importance  of 
spillovers  between  or  within  firms. 

In  general,  as  Cohen  and  Levin  (1989)  suggest,  a  failure  to  control  for  specific  industry  effects 
such  as  variations  in  demand  conditions,  in  technological  opportunity,  and  in  appropriability  conditions, 
coupled  with  an  inability  to  use  project  level  as  opposed  to  aggregate  firm  level  data  may  be  responsible 
for  the  inconclusive  nature  of  existing  results. 

Measuring  Research  Productivity  in  the  Pharmaceutical  Industry 

Pharmaceutical  research  takes  place  in  two  stages:  drug  discovery  and  drug  development.  The 
goal  of  the  drug  discovery  process  is  to  find  a  chemical  compound  that  has  a  desirable  effect  in  a  "screen" 
that  mimics  some  aspect  of  a  disease  state  in  man,  while  the  goal  of  the  drug  development  process  is  to 
ensure  that  compounds  identified  through  the  discovery  process  are  safe  and  effective  in  humans. 
Although  our  data  base  includes  information  about  both  drug  discovery  and  drug  development,  since  the 
two  require  quite  different  sets  of  skills  we  focus  here  upon  the  determinants  of  research  productivity  in 
drug  discovery  as  measured  by  grants  of  important  patents.  In  contrast  to  prior  research  which  has  been 
forced  to  rely  on  publicly  available  firm  level  data,  our  focus  on  drug  discovery  allows  us  to  separate  the 
effect  of  economies  of  scale  in  discovery  from  the  effect  of  any  economies  of  scale  in  drug  development. 

The  measurement  of  "true"  research  productivity  is  a  project  fraught  with  well  known  problems 
(Griliches,  1979).  In  the  ideal  case,  we  would  like  to  measure  the  private  economic  return  to  research. 
Unfortunately  in  this  industry  economic  returns  to  R&D  investment  are  particularly  difficult  to  estimate. 
Returns  to  individual  research  projects  are  highly  skewed  since  they  are  the  final  result  of  a  lengthy  and 
uncertain  process,  driven  both  by  the  uncertainties  of  clinical  testing  and  regulatory  review,  and  by  the 
complexities  introduced  by  marketing,  competitive  activity  and  the  role  of  the  forces  that  determine  the 
demand for new therapies. These problems are compounded by substantial difficulties in estimating costs
of capital and in making appropriate adjustments for risk (Baily, 1972; Di Masi, 1991; Grabowski and
Vernon,  1990). 

We  therefore  narrow  our  focus  to  the  determinants  of  "technical  success"  in  drug  discovery,  as 
measured  by  patent  grants,  and  our  results  are  most  plausibly  interpreted  as  giving  insight  into  the  factors 
that  make  one  research  group  more  likely  to  generate  plausible  new  candidates  for  development  than 
another.  Economic  productivity  is  ultimately  determined  by  this  ability  in  conjunction  with  a  wide  range 
of  additional  factors,  but  we  believe  insight  into  the  research  process  is  of  interest  in  itself. 

Hypothesis  Formulation 

Prior  work  and  our  qualitative  research  allow  us  to  frame  a  number  of  hypotheses  about 
economies  of  scale,  economies  of  scope  and  the  role  of  spillovers  in  pharmaceutical  research. 

Economies  of  Scale 

Conventional  wisdom  in  the  industry  has  it  that  beyond  a  minimum  threshold,  under  most 
circumstances  there  is  little  to  gain  from  increasing  the  size  of  an  individual  research  program,  and  our 
descriptive  statistics  (see  below)  are  certainly  consistent  with  this  belief.  However  since  pharmaceutical 
research  often  requires  investment  in  substantial  fixed  costs  and  since  the  complexity  of  the  underlying 
science  offers  considerable  scope  for  specialization,  we  expect  significant  economies  of  scale  at  the  firm 
level.  For  example,  to  the  degree  that  inputs  such  as  libraries,  computer  resources,  or  large  pieces  of 
equipment  are  fixed  costs,  we  hypothesize  that  a  larger  research  effort  will  gain  economies  of  scale  by 
spreading  them  over  a  larger  base,  and  while  a  smaller  firm  might  need  to  employ  molecular  biologists 
who  can  work  across  a  relatively  wide  range  of  fields,  we  hypothesize  that  a  larger  one  might  be  able 
to  afford  more  narrowly  focused  specialists.  Thus: 

H1:       Individual programs are not subject to returns to scale.
But 

H2:       There  are  returns  to  scale  in  drug  discovery  at  the  firm  level. 

Economies  of  Scope 

We  also  expect  there  to  be  significant  economies  of  scope  in  drug  discovery.  Economies  of  scope 
exist  when  a  tangible  asset  or  a  human  resource  can  be  used  in  more  than  one  application  at  no  additional 
cost,  and  given  the  nature  of  pharmaceutical  research  we  expect  them  to  be  critically  important  in  shaping 
research productivity. Consider, for example, the benefits of investing in a centralized laboratory devoted
to peptide chemistry. Economies of scale exist if the costs of the laboratory are partially fixed, and if the
lab  can  serve  a  larger  and  larger  discovery  effort  for  a  less  than  proportionate  increase  in  cost.  They  will 
also  exist  if  the  laboratory  can  become  more  efficient  as  it  has  more  work  to  do,  possibly  through  the 
specialization  of  its  members.  Economies  of  scope  exist  if  the  work  of  the  peptide  chemists  is  potentially 
relevant  to  a  wide  range  of  applications,  and  can  be  utilized  in  any  one  of  them  without  diminishing  its 
usefulness  in  the  others.  Economies  of  scope  may  also  arise  if  there  are  internal  spillovers  of  knowledge, 
and  the  results  of  successful  research  in  one  field  have  implications  for  work  in  other  fields.  Our 
qualitative  work  leads  us  to  believe  that  pharmaceutical  research  is  characterized  by  both.  On  the  one 
hand  pharmaceutical  research  requires  the  input  of  a  wide  range  of  highly  skilled  specialists,  much  of 
whose  work  is  relevant  across  multiple  fields.  On  the  other,  discoveries  in  one  field  often  have  important 
implications  for  work  in  another.  Several  important  central  nervous  system  therapies,  for  example,  grew 
out  of  work  that  was  originally  focused  on  the  cardiovascular  system.  Hence: 

H3:       There  are  significant  returns  to  scope  in  drug  discovery. 

We  are  agnostic  as  to  whether  economies  of  scale  and  scope  should  increase  or  decrease  with  firm 
size.  While  formal  treatments  of  firm  scale  and  scope  suggest  that  increasing  firm  size  should  be 
unilaterally  beneficial  (Panzar,  1989),  several  researchers  have  suggested  that  beyond  a  certain  point 
escalating  coordination  costs  and  problems  of  agency  lead  to  diseconomies  of  size  (Holmstrom,  1989; 
Zenger,  1994). 

The  Role  of  Spillovers 

Our qualitative research also leads us to believe that spillovers between firms play a major role
in research productivity. The industry is characterized by high rates of publication in the open scientific
literature, and many of the scientists with whom we spoke stressed the importance of keeping in touch
with the science conducted both within the public sector and by their competitors.1 Nearly all of them
had a quite accurate idea of the nature of the research currently being conducted by their competitors,
and they often described the ways in which their rivals' discoveries had been instrumental in shaping their
own research. Thus we hypothesize:

H4:       Research  productivity  is  positively  associated  with  spillovers  of  knowledge  between 
firms. 

1 Measuring the impact of "upstream" spillovers from the public sector, though very important, is
beyond the scope of this paper, but will be addressed in future work.

Notice, however, that the impact of the efforts of competing firms on own research productivity
is ambiguous. On the one hand, in a world in which firms are in a first-past-the-post "race" with each
other  to  reach  a  particular  target,  all  other  things  equal  rivals'  success  imposes  an  "exhaustion  externality" 
on  competitors  and  own  research  productivity  will  be  negatively  correlated  with  competitors'  efforts 
(Reinganum,  1989).  On  the  other  hand,  firms  may  benefit  from  competitors'  research  since,  all  other 
things  equal,  if  there  are  extensive  spillovers  of  knowledge  between  firms,  the  productivity  of  a  research 
team  may  increase  as  others  work  in  the  same  field.  For  example,  when  Squibb  announced  that  they  had 
found an orally active ACE inhibitor, a potent antihypertensive therapy, several competing firms were able
to  take  advantage  of  the  announcement  to  focus  their  own  research  efforts  (Henderson,  1994). 

In  both  theory  and  practice,  of  course,  these  two  effects  interact  with  each  other  in  complex  ways 
(Spence,  1984).  In  a  related  paper,  "Racing  to  Invest?:  The  Dynamics  of  Competition  in  Ethical  Drug 
Discovery"  (Cockburn  and  Henderson,  1994),  we  explore  this  issue  in  more  detail.  For  the  purposes  of 
the  analysis  presented  here,  we  hypothesize  that,  all  other  things  equal,  a  positive  correlation  between 
research  success  in  any  given  area  across  competing  firms  is  consistent  with  the  presence  of  significant 
spillovers  or  research  complementarities  between  firms,  while  a  negative  correlation  is  consistent  with 
a  research  environment  in  which  rivals'  research  efforts  are  substitutes  rather  than  complements, 
spillovers  are  limited  and  research  productivity  is  primarily  driven  by  an  "exhaustion  externality". 

Changes Over Time

We  hypothesize  that  the  last  thirty  years  have  seen  significant  changes  in  the  roles  of  scale,  scope 
and  spillovers  in  shaping  research  productivity  as  the  technology  of  drug  discovery  has  changed 
dramatically (Cocks, 1975; Caglarcan, 1978; Gross, 1983; Spilker, 1989). Thirty years ago drug discovery
was principally a matter of the large scale screening of naturally occurring compounds. In general the
"mechanism of action" of most drugs - the specific biochemical and molecular pathways that were
responsible for their therapeutic effects - was not well understood. A firm searching for a treatment for
hypertension,  for  example,  might  inject  thousands  of  compounds  into  hypertensive  rats  in  the  hope  of 
finding  something  that  reduced  blood  pressure.  Firms  employed  few  scientific  specialists  beyond 
pharmacologists  and  medicinal  chemists,  and  the  most  important  spillovers  between  firms  took  the  form 
of  the  knowledge  that  compounds  of  a  particular  type  showed  promise  in  the  treatment  of  a  particular 
disease.  For  example,  when  Sir  James  Black  announced  the  discovery  of  the  first  beta  blocker  many  of 
ICI's  competitors  immediately  started  to  search  aggressively  for  compounds  of  related  chemical  structure 
that  might  have  equally  dramatic  therapeutic  effects. 

The biological revolution of the last twenty years has changed this process dramatically, moving
it, in popular terminology, from a process of "random search" to one of "rational drug design."
Significant advances in the understanding of both the mechanism of action of many existing drugs and
of the biochemical roots of many diseases have made it possible to design significantly more sophisticated
screens.  If,  to  use  one  common  analogy,  the  action  of  a  drug  on  a  "receptor"  in  the  body  is  similar  to 
that  of  a  key  fitting  into  a  lock,  contemporary  drug  researchers  often  have  a  much  better  developed  sense 
of  what  the  "lock"  may  look  like  and  with  it  a  much  more  precisely  tuned  method  to  screen  and 
synthesize  potentially  effective  compounds.  The  request  "find  me  something  that  will  lower  blood 
pressure  in  rats"  can  be  replaced  with  the  request  "find  me  something  that  inhibits  the  action  of  the 
angiotensin  II  converting  enzyme."  At  the  same  time,  an  improved  understanding  of  molecular  kinetics, 
of  the  physical  structure  of  molecular  receptors  and  of  the  relationship  between  chemical  structure  and 
mechanism of action has sharpened the search for suitable "keys".

This  profound  change  in  the  "technology"  of  pharmaceutical  research  may  have  significantly 
altered  the  relative  importance  of  scale,  scope,  and  spillovers  in  determining  research  productivity.  The 
impact  on  scale  economies  is  hard  to  determine  ex  ante.  While  the  importance  of  scale  economies  inherent 
in  activities  such  as  the  screening  of  thousands  of  compounds  may  have  declined,  the  transition  to 
mechanism-based  research  has  also  greatly  increased  both  the  number  of  scientific  specialties  that  are 
central  to  drug  discovery  and  the  pace  of  knowledge  generation  within  the  industry.  Modern  drug 
discovery  relies  upon  the  input  of  scientists  skilled  in  a  very  wide  range  of  disciplines,  many  of  which 
are  advancing  at  an  extraordinarily  rapid  rate.  Thus  we  hypothesize  that  the  opportunities  to  exploit 
economies  of  scale  arising  from  specialization  may  well  have  increased: 

H5:      Economies  of  scale  in  drug  discovery  may  have  increased  in  the  last  thirty  years. 

Our beliefs about the relative importance of returns to scope are much firmer. As drug discovery
has become increasingly directed by an understanding of the fundamental physiological mechanisms,
knowledge acquired in one area has become more likely to have implications for research conducted
elsewhere within the firm:

H6:       Economies  of  scope  in  drug  discovery  have  significantly  increased  in  the  last  thirty 
years. 

Finally, we believe that there may have been a concomitant change in the role of spillovers in
shaping research productivity. When one's competitors are merely screening compounds there is little to
be learned from them unless they find a particularly promising molecule. But when they are actively
investing in the generation of new physiological and biochemical knowledge, even knowledge of their
false starts and failures may help to shape one's own research program. Thus we hypothesize:

H7:       The impact of inter-firm spillovers on research productivity has significantly increased
over the last thirty years.


2.  Specification  of  the  Econometric  Model 

We  test  these  hypotheses  by  estimating  a  "knowledge  production  function"  of  the  type  proposed 
by  Griliches  (1984),  in  which  new  knowledge  is  generated  using  both  accumulated  "knowledge  capital" 
and  resources  expended  in  the  current  period.  Rather  than  look  for  economies  of  scope  and  scale  in 
restrictions  on  the  parameters  of  cost  functions  (Caves,  Christensen  and  Tretheway,  1984;  Friedlaender, 
Winston  and  Wang,  1983;  Glass  and  McKillop,  1992;  Silk  and  Berndt,  1993),  we  use  direct  measures 
of  scope,  scale  and  spillovers  as  shift  variables  modifying  the  relationship  between  research  expenditures 
and  the  generation  of  knowledge,  where  knowledge  is  measured  as  grants  of  "important"  patents.2 
Pharmaceutical  companies  patent  prolifically,  and  patents  are,  of  course,  a  rather  noisy  measure  of 
research  success,  in  part  because  the  significance  of  individual  patents  varies  widely.  We  control  for  this 
by  counting  only  "important"  patents,  where  an  "important"  patent  is  defined  as  one  that  was  granted  in 
two  of  the  three  major  jurisdictions:  Japan,  Europe  and  the  United  States.3  We  think  of  these  patents  as 
a  useful  measure  of  the  generation  of  new  knowledge,  which  is  the  "raw  material"  input  to  subsequent 
stages  in  drug  development.4  We  hypothesize  that  patent  counts  are  generated  by  a  production  function 
$Y = f(X, \beta)$, where $Y$ is patent counts, $X$ is a vector of inputs to the drug discovery process, and $\beta$ is a
vector  of  parameters.  We  have  no  priors  about  the  "true"  functional  form,  so  the  model  estimated  should 
be  thought  of  as  a  local  approximation. 
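
As a rough illustration of the "important" patent rule described above, the following Python sketch counts patents granted in at least two of the three major jurisdictions. The jurisdiction codes, the function names, and the reading of "two of the three" as "at least two" are assumptions made for this example; they are not taken from the study's data or procedures.

```python
# Hypothetical jurisdiction codes; the text names Japan, Europe and the United States.
MAJOR_JURISDICTIONS = frozenset({"JP", "EP", "US"})


def is_important(granted_in):
    """Return True if a patent was granted in at least two major jurisdictions.

    Treating "two of the three" as "at least two" is an assumption.
    """
    return len(MAJOR_JURISDICTIONS & set(granted_in)) >= 2


def count_important(patents):
    """Count important patents in an iterable of jurisdiction collections."""
    return sum(1 for granted_in in patents if is_important(granted_in))


# Illustrative records only, not data from the study.
example = [{"US"}, {"US", "EP"}, {"JP", "EP", "US"}, {"US", "CA"}]
important_count = count_important(example)
```

Under this reading, a patent granted only in the United States, or in the United States and a jurisdiction outside the three named, would not be counted.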

2 The use of structural models of this type to explore the determinants of performance has been a
matter of some debate (Rumelt, 1991). We employ it as a technique whose limitations are quite well
understood and as a first step towards exploring a complex and multifaceted phenomenon.

3 We would have preferred to have been able to use citation weighted patents as our measure of
output. Unfortunately since the U.S. patent classifications do not map directly into our definitions of
research programs we would have had to buy citation data directly from Derwent Publications. This
proved to be prohibitively expensive.

4 Patents are clearly only one possible measure of research output, and later work will explore
the use of alternative measures such as the number of compounds obtaining the various levels of
regulatory approval. Preliminary analysis suggests that Investigational New Drug Applications (INDs)
are highly correlated with "important" patents, suggesting that our output measure does indeed
capture an important dimension of performance.

Some previous studies have looked at the dynamics of the R&D/patents relationship by estimating
a lag structure on the input variables. Rather than make assumptions about distributed lags (and have to
throw out much of our data in order to have 4 or 5 lags present in every program) we include "stocks"
of  the  input  variables  as  explanatory  variables.  The  annual  flows  are  reasonably  smooth,  so  this  is 
equivalent  in  many  senses  to  imposing  a  geometric  lag  structure,  where  we  have  assumed  a  depreciation 
rate  rather  than  estimated  one.  Note  that  given  a  smooth  series  for  the  flow  variable,  it  will  be  difficult 
as  a  practical  matter  to  identify  the  estimated  coefficient  on  the  stock  variable  separately  from  the 
depreciation  rate,  making  our  assumption  of  a  particular  depreciation  rate  a  second-order  problem. 
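
The link between a stock variable and a geometric lag can be written out explicitly. The following is a sketch under assumptions that are not stated in the text: $K_t$ denotes the stock, $R_t$ the annual flow, $\delta$ the assumed depreciation rate, and the stock is taken to follow the usual perpetual-inventory recursion.

```latex
% Perpetual-inventory stock (assumed form) and its geometric-lag expansion
K_t = (1-\delta)\,K_{t-1} + R_t = \sum_{s=0}^{\infty} (1-\delta)^{s} R_{t-s}

% Limiting case of a perfectly smooth (constant) flow R_t = R
K = \frac{R}{\delta}
\quad\Longrightarrow\quad
\log K = \log R - \log \delta
```

In this limiting case a different choice of $\delta$ only shifts the intercept of a regression on $\log K$, which is one way to see why the coefficient on the stock is hard to separate from the depreciation rate when the flow is smooth.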

Since  the  dependent  variable  in  this  relationship  only  takes  on  non-negative  integer  values,  some 
type  of  discrete  dependent  variable  model  is  dictated.  We  assume  that  patent  counts  are  generated  by  a 
Poisson  process,  which  is  appropriate  if  we  are  prepared  to  model  research  results  as  the  outcome  of  an 
unknown  (but  large)  number  of  Bernoulli  trials  with  a  small  probability  of  success.  This  model  certainly 
captures  some  aspects  of  drug  discovery,  such  as  screening.  It  may  be  less  appropriate  for  mechanism- 
based  research,  and  we  explore  the  robustness  of  some  of  our  results  to  the  use  of  alternative  estimation 
techniques  below. 

We model the single parameter of the Poisson distribution function, $\lambda$, as a function of some
explanatory variables, $X$, and parameters $\beta$ in the standard fashion:

$$E[Y_{it}] = \lambda_{it} = \exp(X_{it}\beta)$$

to guarantee non-negativity of $\lambda$, and estimate the parameters by maximum likelihood in the standard
way.  Note  that  the  choice  of  whether  to  use  explanatory  variables  in  levels  or  logs  has  important 
implications  in  this  model.  If  we  use  levels,  the  estimated  elasticity  of  output  with  respect  to  each 
explanatory  variable  will  vary  with  the  magnitude  of  the  variable  by  assumption.  Conversely,  if  an 
explanatory  variable  enters  in  logs  we  impose  the  constraint  that  the  elasticity  is  constant  over  its  range 
of  variation.  In  the  case  of  spending  on  research,  for  the  results  presented  here  we  maintain  the  null 
hypothesis of constant elasticity of Y with respect to research expenditures, allowing easy comparison with
previous  work.3  In  exploratory  work  we  tested  for  non-linearities  in  this  relationship.  We  found  that 
estimated  coefficients  on  quadratic  terms  in  the  log  of  research  were  very  small  and  insignificant.  For 
research  spending  measured  in  levels,  additional  terms  were  significant,  but  expansion  suggested  that  the 
elasticity  of  Y  with  respect  to  total  research  spending  was  indeed  constant  over  the  range  of  our  data. 
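
The constant-elasticity property of the log specification can be illustrated with a small maximum likelihood fit. This is a minimal sketch, not the estimation code used in the paper: it assumes a single regressor (the log of research spending) plus an intercept, uses a hand-written Newton-Raphson iteration, and runs on made-up data. The function name `fit_poisson` and the numbers are hypothetical.

```python
import math

def fit_poisson(x, y, iterations=50):
    """Newton-Raphson MLE for the model E[y] = exp(a + b * x), one regressor."""
    a, b = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score: gradient of the Poisson log-likelihood with respect to (a, b).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix: negative Hessian of the log-likelihood.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

# Illustrative data only: output equals 2 * R**0.5, so the elasticity is 0.5.
research = [1.0, 4.0, 16.0, 64.0]
patents = [2.0, 4.0, 8.0, 16.0]
log_research = [math.log(r) for r in research]
intercept, elasticity = fit_poisson(log_research, patents)
print(intercept, elasticity)
```

Because the regressor enters in logs, the fitted slope is itself the elasticity of expected output with respect to research spending, whatever the level of spending.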


3      Since  we  have  a  fair  number  of  observations  in  which  the  R&D  variables  are  zero,  using  the 
log  of  the  R&D  variables  introduces  the  complication  that  we  cannot  take  the  log  of  zero.   Following 
previous  work,  we  deal  with  this  by  setting  the  log  of  R&D  equal  to  zero  in  such  cases,  including  an 
appropriately  coded  dummy  variable  to  account  for  this  in  the  regression. 
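
The zero-handling convention described in this footnote can be sketched as follows. The helper name `log_with_zero_dummy` is hypothetical, and the assumption that the dummy equals one exactly when the R&D value is zero is an illustrative reading of "appropriately coded".

```python
import math

def log_with_zero_dummy(values):
    """Return (log_value, zero_dummy) pairs for a list of non-negative numbers.

    Positive values map to (log(v), 0). Zeros map to (0.0, 1), so the dummy
    absorbs the arbitrary choice of 0.0 as the stand-in for log(0).
    """
    rows = []
    for v in values:
        if v > 0:
            rows.append((math.log(v), 0))
        else:
            rows.append((0.0, 1))
    return rows

# Illustrative spending figures (in millions), not data from the study.
print(log_with_zero_dummy([0.0, 1.0, 2.5]))
```

Note that a value of exactly 1.0 also has a log of zero; the dummy is what distinguishes it from a true zero in the regression.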



We  have  no  strong  priors  about  the  appropriate  way  to  include  the  other  explanatory  variables 
and  we  report  results  obtained  by  entering  these  variables  in  levels.  Many  of  these  variables  also  have 
substantial  numbers  of  zeros,  and  this  avoids  numerical  problems  in  the  estimation  caused  by  near 
collinearity of the VARIABLE=0 dummy variables with other regressors, but very similar results were
obtained  in  exploratory  work  when  all  explanatory  variables  were  entered  in  logs. 

A  useful  way  to  think  about  this  specification  is  to  divide  the  explanatory  variables  into  two 
classes:  the  research  expenditure  variables,  R,  and  other  variables,  Z.  The  estimated  function  has  a  direct 
proportionate  relationship  between  research  expenditures  and  patent  counts,  mediated  by  a  set  of 
multiplicative  "shift  variables"  Z 

$$E[Y_{it}] = \lambda_{it} = \exp(\beta \log(R_{it}) + \gamma Z_{it})
            = R_{it}^{\beta} \cdot \exp(\gamma Z_{it})$$

where  Z  includes  both  our  measures  of  scope,  scale  and  spillovers  and  a  constant  term  and  firm  and 
therapeutic  class  dummies.  Re-writing  the  equation  in  logs, 

$$\log(\lambda_{it}) = \beta \log(R_{it}) + \gamma Z_{it}$$

thus we can interpret the coefficient on log(R) directly as the elasticity of Y with respect to research,
while the elasticities of the additional variables Z are $\gamma Z$.
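
The two elasticity statements follow from differentiating the log of the Poisson mean, $\log \lambda = \beta \log R + \gamma Z$. In the sketch below the subscript $k$, indexing a single element of $Z$ that enters in levels, is notation introduced here for illustration.

```latex
% Research spending enters in logs: constant elasticity
\frac{\partial \log \lambda_{it}}{\partial \log R_{it}} = \beta

% A shift variable Z_k enters in levels: elasticity varies with its magnitude
\frac{\partial \log \lambda_{it}}{\partial \log Z_{k,it}}
  = Z_{k,it}\,\frac{\partial \log \lambda_{it}}{\partial Z_{k,it}}
  = \gamma_k Z_{k,it}
```

Elasticities for the variables entered in levels therefore have to be evaluated at a particular value of the variable, such as its sample mean.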

The assumption that the dependent variable is distributed Poisson is quite strong: as in most other
data of this type, the mean = variance property of the Poisson distribution is violated here. In the presence
of such overdispersion, although the parameters $\beta$ will be consistently estimated, their standard errors will
typically be under-estimated, leading to spuriously high levels of significance. Overdispersion is often
interpreted as evidence that the statistical model is misspecified in the sense that there may be
unobserved variables in the equation for $\lambda$,

$$E[Y_{it}] = \lambda_{it} = \exp(X_{it}\beta + \epsilon_{it})$$

As is well known (see Hausman, Hall, and Griliches (1984) and Hall, Griliches, and Hausman (1986)), if $\epsilon$ is
distributed gamma, then it can be integrated out, giving Y distributed as a negative binomial variate. If
$\epsilon$ is not truly gamma, however, the maximum likelihood estimates of the coefficients of the model will
be inconsistent. Gourieroux, Monfort, and Trognon (1984) suggest using a quasi-generalized pseudo-
maximum likelihood estimator based on the first two moments of the distribution of Y, which gives
consistent estimates for $\epsilon$ drawn from a wide variety of distributions. The GMT estimator is just weighted
non-linear least squares estimates of the model

$$Y_{it} = \exp(X_{it}\beta) + \epsilon_{it}$$

with weights derived from the relation $\mathrm{Var}[Y] = E[Y](1 + \eta^{2} E[Y])$ using initial consistent estimates of
$\beta$. Below we present alternate estimates of some of our regression models using maximum-likelihood
estimation of the Poisson and Negative Binomial models, non-linear least squares (with robust standard
errors), and the GMT estimator.
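
The weighting step of the GMT estimator can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the weights are taken to be the inverse of the variance implied by the relation Var[Y] = E[Y](1 + eta^2 E[Y]), the fitted means are assumed to come from an initial consistent estimate of the coefficients, and the function name `gmt_weights` and the value of `eta_sq` are hypothetical.

```python
def gmt_weights(fitted_means, eta_sq):
    """Inverse-variance weights from Var[Y] = mu * (1 + eta_sq * mu).

    fitted_means: positive fitted values exp(X * beta) from a first-stage fit.
    eta_sq: overdispersion parameter; eta_sq = 0 gives the Poisson variance.
    """
    return [1.0 / (mu * (1.0 + eta_sq * mu)) for mu in fitted_means]

# Illustrative values only.
first_stage_means = [1.0, 2.0, 4.0]
print(gmt_weights(first_stage_means, eta_sq=0.5))
print(gmt_weights(first_stage_means, eta_sq=0.0))
```

With a positive overdispersion parameter the implied variance grows with the square of the mean, so observations with large fitted counts are down-weighted more heavily than under the Poisson assumption.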

3.  Sources  and  Construction  of  the  Data  Set. 

This  paper  uses  a  data  set  obtained  as  part  of  a  larger  study  of  research  productivity  in  the 
pharmaceutical  industry.  The  larger  study  has  both  qualitative  and  quantitative  components.  The 
qualitative  study  draws  upon  the  medical  and  scientific  literature  and  upon  a  program  of  detailed  field 
interviews,  and  was  designed  both  to  shape  the  choice  of  variables  and  hypotheses  explored  in  the 
quantitative  study  and  to  explore  the  role  of  less  easily  measurable  factors  such  as  organizational  structure 
in  driving  research  productivity. 

The  quantitative  study  draws  upon  data  about  spending  and  output  at  the  research  program  level 
obtained  from  the  internal  records  of  ten  pharmaceutical  firms.  Although  for  reasons  of  confidentiality 
we  cannot  describe  these  firms,  we  can  say  that  they  cover  the  range  of  major  R&D-performing 
pharmaceutical  manufacturers,  that  they  include  both  American  and  European  firms,  and  that  we  believe 
that  they  are  not  markedly  unrepresentative  of  the  industry  in  terms  of  size  or  technical  or  commercial 
performance.  Between  them  they  represent  about  25%  of  the  pharmaceutical  research  conducted  world 
wide.  This  section  offers  a  brief  description  of  the  important  variables  used  in  the  econometric  analysis, 
and  presents  descriptive  statistics  for  the  data  set. 

The  data  set  used  in  this  paper  is  an  unbalanced  panel  indexed  by  firm,  research  program,  and 
year. With a complete, rectangular panel we would have 11,400 observations, made up of ten firms, 38
research programs, and up to 30 years of data. In practice not all of these observations are available: the
time period for which we have complete data is on average just under 20 years per firm, and not
all  firms  are  active  in  all  research  areas.  Our  working  sample  is  drawn  from  a  data  base  which  has  5543 
potentially  useful  observations.  After  deleting  missing  values,  grossly  problematic  data  and  peripheral 
research  areas  we  are  left  with  4930  observations.  The  number  of  observations  per  firm  varies  from  over 
1000  to  less  than  100,  with  a  mean  of  489.8.  For  each  observation  we  have  data  on  both  inputs  and 
outputs  to  the  research  process.  Our  measures  of  input  include  person  years  and  research  spending  in 
discovery and development, and our measures of output include patents, INDs, NDAs, new drug
introductions,  sales  and  market  share. 

Assembling  the  data  in  a  consistent  and  meaningful  format  required  considerable  effort.  In  nearly 
every  case  the  process  of  data  collection  was  an  iterative  one,  involving  close  collaboration  between  the 
researchers  and  key  personnel  from  the  participating  companies.  The  majority  of  the  data  were  collected 
specifically for the purposes of this study. Each firm spent some months assembling its data,
usually  from  primary  documents,  and  the  full  data  collection  effort  took  nearly  two  years.  We  worked 
hard  to  ensure  that,  as  far  as  was  possible,  definitions  of  research  program  and  of  expense  grouping  were 
standard  across  firms,  data  were  collected  at  the  same  level  of  aggregation,  and  overhead  expenses  were 
treated  in  a  consistent  way,  and  we  took  steps  to  ensure  that,  wherever  possible,  the  data  included 
worldwide  research  spending,  not  just  US  facilities. 

Data were collected by research program rather than by broad therapeutic class or by individual
project since we believe that analyzing the problem in this way best reflects the dynamics of discovery
research. A grouping by therapeutic class is too general: "cardiovascular research," for example, includes
research into widely different areas, including hypertension, cardiotonics, antiarrhythmics and
hyperlipoproteinemia. However, analysis of data by individual project is difficult and misleading. Not only
is  it  difficult  to  assign  effort  to  particular  drug  candidates  with  any  accuracy,  but  the  notion  that  research 
productivity  is  best  measured  at  this  level  raises  some  conceptual  difficulties.  A  research  program 
typically  continues  over  many  years,  and  at  the  discovery  stage,  the  firm  invests  in  the  program,  rather 
than  in  particular  candidates.  The  identification  of  a  drug  development  candidate  is  an  indication  of  the 
success  of  the  program,  and  retrospectively  assigning  resources  to  its  generation  may  introduce  serious 
biases  into  the  analysis. 

Classification  of  our  data  into  therapeutic  areas  is  an  important  factor  in  our  analysis,  since  it 
drives  both  the  fundamental  organization  of  our  data,  and  the  notion  of  spillovers.  We  classified  our  data 
into therapeutic areas using the IMS Worldwide class definitions. (A more detailed discussion of this issue
is given in the Appendix.) We distinguish between two tiers of aggregation: the detailed "research
program"  level,  and  a  more  aggregated  "therapeutic  class"  level  which  groups  related  programs  into 
therapeutic  areas.  For  example,  the  therapeutic  class  "Central  Nervous  System  Drugs"  includes  the 
research  programs  "Depression,"  "Anxiety"  and  "Analgesics." 

Full  details  of  the  construction  of  the  variables  used  in  the  study  are  given  in  the  Appendix.  Our 
primary  variables  are  DISCOVERY,  defined  as  "expenditure  relating  primarily  to  the  production  of  new 
compounds",  which  excludes  clinical  development  work  and  is  measured  in  constant  dollars;  and 
PATENTS, a count of "important" patents. We deflate discovery using the biomedical research and
development  price  index  issued  by  the  National  Institutes  of  Health.  These  measures  of  inputs  and  outputs 
are  matched  by  year  and  research  program.  We  count  patents  by  their  year  of  application,  and  define 
"importance"  by  the  fact  that  the  patent  was  granted  in  two  of  the  three  major  markets:  the  USA,  Japan, 
and  the  European  Community.  Applying  this  criterion  screens  out  large  numbers  of  patents:  for  example 
up  to  60%  of  the  number  filed  in  the  US  in  any  given  year  are  discarded.  The  first  year  in  which  we 
were able to obtain this data is 1961, and because patent grants may lag applications by as much as four
years  in  the  United  States  and  six  in  Japan  we  use  only  observations  for  the  years  1961-1988. 
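
The "importance" screen can be expressed as a simple rule. The sketch below is illustrative: it reads "granted in two of the three major markets" as "granted in at least two", and the market codes and the function name `is_important` are hypothetical labels rather than anything defined in the paper.

```python
# Hypothetical codes for the three major markets named in the text.
MAJOR_MARKETS = {"US", "JP", "EC"}

def is_important(granted_in):
    """True if a patent was granted in at least two of the three major markets."""
    return len(MAJOR_MARKETS & set(granted_in)) >= 2

# Illustrative patents: each entry lists the jurisdictions that granted it.
grants = [{"US", "JP"}, {"US"}, {"US", "JP", "EC"}, {"US", "CA"}]
important_count = sum(1 for g in grants if is_important(g))
print(important_count)
```

Grants in jurisdictions outside the three major markets do not count towards the threshold, which is why the last illustrative patent is screened out.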

We  measure  the  scale  of  the  firm's  research  effort  by  SIZE,  total  discovery  spending  that  year. 
There  are  some  grounds  for  believing  that  the  relevant  measure  of  size  is  the  scale  of  the  entire  firm,  as 
captured  by,  for  example,  total  sales,  or  total  employees,  and  this  measure  has  been  used  in  a  number 
of  previous  studies  of  the  industry.  We  experimented  with  these  types  of  measures  in  exploratory  work, 
but  obtained  poor  results.  This  is  perhaps  not  surprising  given  that  any  size  effect  captured  by  a  variable 
such  as  total  firm  sales  is  likely  to  be  confounded  with  factors  such  as  demand  (Jensen,  1987;  Graves  and 
Langowitz,  1993). 

In  order  to  test  for  economies  of  scope  we  constructed  a  variety  of  measures  of  the  diversity  of 
the  firm's  research  effort  across  therapeutic  classes.  SCOPE  is  a  count  of  the  number  of  research 
programs in which the firm spent more than $500,000, in constant 1986 dollars, in a single year. Our
qualitative work led us to believe that this level (roughly equivalent to the employment of three to four
Ph.D. level scientists) was the smallest size program such that the firm could reasonably hope to
maintain sufficient "absorptive capacity" in an area so as to be able to take advantage of results generated
elsewhere within the firm or in the industry (Cohen and Levinthal, 1989). We experimented with a
variety of alternative thresholds, but in general the use of different cutoff points yielded similar or
insignificant  results.  We  also  constructed  SMALL,  a  count  of  research  programs  in  which  the  firm  spent 
less  than  this  threshold  to  test  the  hypothesis  that  diseconomies  of  scope  could  arise  through  the 
multiplication  of  coordination  costs. 
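
The two counts can be sketched for a single firm-year as follows. This is a minimal illustration, assuming spending is expressed in millions of constant 1986 dollars and that SMALL counts only programs with strictly positive spending below the threshold; that lower bound, the function name `scope_and_small`, and the figures are assumptions made here.

```python
THRESHOLD = 0.5  # millions of constant 1986 dollars ($500,000)

def scope_and_small(spending_by_program, threshold=THRESHOLD):
    """Return (SCOPE, SMALL) for one firm-year.

    SCOPE counts programs with spending above the threshold.
    SMALL counts programs with positive spending below it (an assumption:
    programs with zero spending are treated as inactive and not counted).
    """
    values = spending_by_program.values()
    scope = sum(1 for s in values if s > threshold)
    small = sum(1 for s in values if 0 < s < threshold)
    return scope, small

# Illustrative firm-year, not data from the study.
firm_year = {"Depression": 2.0, "Anxiety": 0.3, "Analgesics": 0.0, "Hypertension": 0.6}
print(scope_and_small(firm_year))
```

Passing a different `threshold` reproduces the kind of sensitivity check on alternative cutoff points that the text mentions.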

To explore a different dimension of diversification, we computed HERF, the Herfindahl index
of the research portfolio, as a measure of the "focus" of the firm's research effort. HERF is defined as:

$$\mathrm{HERF}_{jt} = \sum_{i=1}^{n} s_{ijt}^{2}$$

where $s_{ijt}$ is the share of the $i$th research program in the $j$th firm in year $t$ and there are $n$ research
programs. Note that SCOPE and HERF are not simple transformations of each other. In general,
larger  firms  run  more  programs,  and  thus  tend  to  have  higher  values  of  SCOPE  and  lower  values  of 
HERF  by  construction.  Nonetheless  some  of  the  smaller  firms  in  the  sample  have  very  diverse  portfolios 
with  activities  spread  evenly  over  a  wide  range  of  programs  and  some  of  the  larger  firms  concentrate  their 
efforts  in  highly  focused  portfolios:  the  Pearson  correlation  coefficient  between  SCOPE  and  HERF  is  only 
0.66. 
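
A Herfindahl index of this kind can be computed as in the sketch below. It assumes that the shares are shares of the firm's discovery spending in a given year, which the text does not spell out, and the function name `herfindahl` and the portfolios are illustrative.

```python
def herfindahl(spending):
    """Sum of squared spending shares across a firm's research programs."""
    total = sum(spending)
    if total <= 0:
        raise ValueError("total spending must be positive")
    return sum((s / total) ** 2 for s in spending)

# Two illustrative portfolios with the same number of active programs.
even_portfolio = [1.0, 1.0, 1.0, 1.0]       # spread evenly over four programs
focused_portfolio = [9.7, 0.1, 0.1, 0.1]    # dominated by one program
print(herfindahl(even_portfolio), herfindahl(focused_portfolio))
```

The two portfolios have the same number of programs but very different index values, which illustrates why a count of programs and a concentration index need not move together.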

We  hypothesize  that  both  SCOPE  and  HERF  have  non-linear  relationships  to  productivity:  that 
both  very  highly  focused  and  very  disparate  efforts  will  be  on  average  less  productive  than  those  that  are 
"appropriately"  focused  and  "appropriately"  diverse.  Highly  focused  firms  will  be  less  productive  since 
they  will  be  less  able  to  take  advantage  of  internal  economies  of  scope,  and  very  unfocused  programs  will 
be  less  productive  since  they  will  incur  higher  coordination  costs.  Note  that  SCOPE,  HERF,  SMALL  and 
SIZE  are  unique  only  to  firm  and  year,  not  to  firm,  year,  and  research  program. 

We  then  construct  variables  intended  to  capture  the  effects  of  spillovers  both  within  and  between 
firms.  For  each  observation,  we  start  with  the  basic  data  on  each  program's  "own"  annual  flows  of 
patents.  We  then  construct  spillover  variables  at  two  levels.  We  capture  spillovers  internal  to  the  firm  - 
one  of  the  mechanisms  through  which  the  firm  realizes  economies  of  scope  -  by  measuring  the  output 
of  all  of  the  other  research  programs  within  the  relevant  broad  therapeutic  class.  For  example,  in  the  case 
of  a  program  in  Depression  we  look  for  spillovers  from  the  firm's  other  Central  Nervous  System 
programs.  This  is  equivalent  to  using  a  binary  measure  of  technological  distance:  programs  within  the 
same  therapeutic  class  are  assumed  to  be  plausible  sources  of  spillovers  whereas  other  programs  are  not. 
We  construct  two  measures  of  spillovers  between  firms.  On  the  one  hand  we  count  competitors'  output 
in  the  same  narrow  research  area,  and  on  the  other  we  count  competitors'  output  in  all  the  other  programs 
in  the  wider  therapeutic  class.6  Finally,  we  construct  "stocks"  for  all  of  these  variables  by  accumulating 
the  flows  over  time  with  a  20%  depreciation  rate,  and  also  a  "news"  variable  by  subtracting  20%  of  the 
stock  at  the  beginning  of  the  year  from  the  annual  flow. 
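
The stock and "news" construction can be sketched as a perpetual-inventory calculation. The sketch assumes that the stock starts at zero before the first observed year and that each year's flow is added in full after the opening stock is depreciated; neither detail is stated in the text, and the function name `stocks_and_news` is hypothetical.

```python
def stocks_and_news(flows, depreciation=0.2, initial_stock=0.0):
    """Accumulate annual flows into end-of-year stocks and compute "news".

    news_t  = flow_t - depreciation * (stock at the beginning of year t)
    stock_t = (1 - depreciation) * (stock at the beginning of year t) + flow_t
    """
    stocks, news = [], []
    stock = initial_stock
    for flow in flows:
        news.append(flow - depreciation * stock)
        stock = (1.0 - depreciation) * stock + flow
        stocks.append(stock)
    return stocks, news

# Illustrative annual patent flows, not data from the study.
stocks, news = stocks_and_news([10.0, 10.0, 0.0])
print(stocks, news)
```

Under these assumptions the "news" variable equals the year-on-year change in the stock, so it is positive only when the current flow more than replaces depreciation.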


6     We  chose  as  the  sample  from  which  to  construct  our  measure  of  competitors'  patents  the  ten 
firms  that  have  given  us  data  together  with  19  other  firms  who  have  been  consistently  in  the  top  40 
world  wide  pharmaceutical  firms  in  terms  of  R&D  dollars  and  sales.  Note  that  these  19  firms  are  only 
a  fraction  of  the  population  of  other  firms  generating  spillovers,  and  the  estimated  coefficient  on  our 
external  spillover  variables  may  therefore  tend  to  overstate  the  magnitude  of  this  effect.  However  we 
believe  that  these  firms  are  a  representative  sample  of  the  industry  as  a  whole,  and  account  for  the 
majority  of  the  industry's  spillover  pool. 
Descriptive Statistics

Table (1) and Figures (1) to (4) present summary statistics describing the data. Averaging across
all firms, research programs, and years, our firms spent $1.06m in 1986 dollars on discovery
per program per year for an average of 1.8 "important" patents. Each "important" patent, on average,
thus  cost  about  $600,000  in  1986  dollars.  The  average  firm  in  our  sample  is  highly  diversified,  investing 
substantially  (more  than  half  a  million  dollars)  in  just  over  ten  programs  a  year.  In  addition  these  firms 
on  average  invest  more  than  ten  thousand  dollars  a  year  in  a  further  six  programs. 
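
The cost-per-patent figure quoted above is the ratio of the two sample means. Written out, with the caveat that a ratio of means need not equal the mean of program-level ratios:

```latex
\frac{\$1.06\text{m per program-year}}{1.8\ \text{important patents per program-year}}
  \approx \$0.59\text{m per important patent} \approx \$600{,}000
```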

All  of  the  key  variables  show  a  substantial  amount  of  variation.  Perhaps  the  most  dramatic  effect 
visible  in  the  time  series  aggregates  is  the  continuing  increase  in  discovery  spending  despite  the  fact  that 
the mean cost per important patent rose dramatically from 1975 onwards. Figure (1) shows the evolution
of mean discovery spending per program and mean important patents obtained per program over time.
Mean discovery spending per program almost tripled in real terms over our sample period, from just over
$0.6m in 1965 to around $1.9m in 1988. This trend reflects both an expansion in the scale of the research
effort and a change in its nature, as the adoption of rational drug discovery techniques has increased the
demand for sophisticated equipment and more highly skilled researchers. At the same time mean patents
obtained per program fell dramatically after 1978, from a high of just over 3 to less than 0.5 in 1988.
In combination, these two trends imply that mean important patents per million real discovery dollars fell
from a high of 2.4 in 1978 to less than 0.5 after 1986. The same trends are evident in data aggregated
to the level of the firm: between 1965 and 1990 mean discovery spending per firm almost tripled in real
terms, from just over $18m in 1965 to around $54m in 1990,7 while mean patents obtained per firm fell
precipitously. 

The  decline  in  patenting  rates  may  reflect  the  general  downward  trend  that  has  characterized  US 
and  European  firms  over  the  last  decade,  some  of  which  simply  reflects  institutional  factors  such  as  the 
resource constraints imposed on the U.S. Patent Office in the early eighties (Griliches, 1992). However
the  decline  in  our  data  is  significantly  more  precipitous  than  that  characteristic  of  the  economy  in  general, 
and  we  suspect  that  more  fundamental  factors  may  be  at  work. 

7 This number may seem surprisingly small, given the size of the research budgets announced
by some of the larger firms in the industry in 1993. Recall that it is calculated as an arithmetic mean,
that it is measured in 1986 dollars and that it includes only resources spent on discovery, not clinical
development.

One possibility is that the transition to more rational methods of drug discovery has led to a
change in patenting strategies and an increase in the significance of each patent: real research output may
have remained constant or even risen. Another possibility is that the industry is approaching technological
exhaustion (Grabowski, Vernon and Thomas, 1978; Grabowski and Vernon, 1990), but rates of
investment in research may be continuing to accelerate despite the decline in patenting (Figure (2)) in
response to what Sutton (1991) has identified as the need to maintain differentiation in the product
market.

Looking  next  at  the  program  level  data,  some  surprising  characteristics  of  the  research  portfolio 
emerge,  highlighting  the  pitfalls  of  using  data  aggregated  to  the  level  of  the  firm.  Figure  (3)  presents  the 
size  distribution  of  discovery  expenditures  per  research  program.  The  mean  firm  in  our  sample  invests 
heavily  in  a  few  large  programs  but  also  invests  in  a  large  number  of  much  smaller  programs.  For  over 
44%  of  the  program-years  in  our  sample,  no  expenditures  on  discovery  were  recorded  at  all.  To  the  best 
of  our  knowledge  these  are  "genuine"  zeros  and  reflect  intermittent  expenditures  over  time,  or  programs 
which  were  "alive"  in  that  the  firm  was  consistently  obtaining  patents  or  was  engaged  in  clinical 
development in the area.8 Of the remainder, about one quarter of cases involved spending less than
$0.2m  1986  dollars  per  program  per  year,  and  about  forty  percent  spent  less  than  our  critical  threshold 
of  $0.5m  1986  dollars  per  program  per  year.  At  the  other  tail  of  the  distribution,  just  over  2%  of  cases 
involved  expenditures  of  more  than  $10m  1986  dollars  per  program  per  year. 

On  the  output  side,  the  distribution  of  annual  counts  of  important  patents  per  program  (Figure 
(4))  is  also  highly  skewed.  About  half  of  our  observations  show  zero  output  per  program-year,  and  almost 
90%  of  the  counts  are  less  than  5  per  year.  Patent  counts  also  vary  significantly  across  firms,  across 
therapeutic  classes  and  across  time.  While  the  mean  number  of  patents  per  program  is  1.7  per  year,  this 
varies across firms from 3.1 to 0.09 per year, and across therapeutic classes from 0.7 per year for work
in  the  genito-urinary  system  to  3.7  per  year  for  cardiac  and  circulatory  products.  There  are  also  very 
substantial  differences  in  the  average  level  of  expenditure  across  therapeutic  classes:  mean  discovery 
expenditures  per  program  range  from  roughly  $900,000  per  year  in  alimentary  tract  and  metabolic 
research  to  over  $4.0m  per  year  in  anticancer  research. 

4.  The  Empirical  Results. 

In this section we present our estimation results. We begin by presenting results obtained by
aggregating data to the firm level to allow comparison with previous work (Table (2)). We then turn to
analysis at the program level. Table (3) presents results obtained from the full sample, Table (4) explores
the robustness of the Poisson specification, and Table (5) explores alternative measures of scope. In
Tables (2) through (5) we control for any change in regime after 1978 by allowing for changes in the
intercept and in the time trend. 1978 was the year after the publication of Cushman and Ondetti's seminal
paper describing their synthesis of an orally active ACE inhibitor, often described as one of the first
examples of successful "rational" drug design (Cushman, Cheung, Sabo and Ondetti, 1977). It was also
the year in which the patents obtained per discovery dollar for the firms in our sample began their
precipitous fall.9 In Table (6) we test for changes in the roles of scope, scale and spillovers over time by
splitting the sample into pre-1978 and post-1978 subsamples.

8 Deleting these observations from the data set does not substantially change either the
magnitude or the significance of our estimated coefficients.

Analysis  of  firm  level  data. 

Table  (2)  presents  the  results  of  estimating  the  knowledge  production  function  discussed  in  section 
(2)  using  data  aggregated  to  the  firm  level.  This  aggregation  generates  a  rather  small  sample,  with  only 
181 observations on our 10 firms, and results should therefore be treated with caution. In common
with  previous  work  we  find  no  evidence  of  returns  to  scale:  the  implied  long  run  elasticity  of  "important" 
patent  output  with  respect  to  research  spending  in  every  model  is  between  0.4  and  0.5.  This  is  somewhat 
lower than the results obtained in previous studies of the pharmaceutical industry (Comanor, 1965; Vernon
and  Gusen,  1974;  Graves  and  Langowitz,  1993;  Schwartzman,  1976;  and  Jensen,  1987)  and  for  much  larger 
samples  of  manufacturing  firms  by,  for  example,  Bound  et  al.  (1984),  Hausman,  Hall,  and  Griliches 
(1984)  and  Hall,  Griliches  and  Hausman  (1986).  However  there  are  two  important  differences  between 
our  work  and  previous  studies  which  make  the  results  not  strictly  comparable.  In  the  first  place  we  have 
distinguished  between  discovery  and  development  expenditures,  and  in  the  second  place  our  patent  counts 
are  restricted  to  "important"  patents. 

9 The choice of 1978 as a cutoff point is inevitably somewhat arbitrary. We experimented with
some alternative dates with little impact on the results.

These results also indicate that firm effects are a very important determinant of innovative
performance: comparing equations (1) and (2) we see that including firm dummies increases the
log-likelihood function very substantially, which corresponds to an increase in R2 from 0.29 to 0.79. To some
extent these firm dummies are picking up factors such as systematic differences across firms in their
propensity to patent, accounting practices, labor market conditions in different countries and so forth, but
their statistical importance also reflects more interesting types of heterogeneity among firms. For
example, in these firm level data we cannot control for the "mix" of investment across different
therapeutic classes, which varies substantially across firms and has important implications for aggregate
productivity. (At the request of the firms supplying the data, we do not report the estimated coefficients
on these dummies here or in the other Tables, but they are jointly and separately highly significant in all
of the models estimated. The ranking of firms according to these dummies conforms to our beliefs from
our qualitative work about their relative innovative performance.)

Equations  (3)  and  (4)  investigate  the  presence  of  returns  to  scope.  In  equation  (3)  our  measure 
of  scope  enters  the  regression  linearly,  with  an  implausibly  negative  coefficient.  Our  preferred  model 
allows  for  diminishing  returns  to  scope  by  including  a  quadratic  term,  giving  us  a  more  plausible 
inverted-U  relationship  between  scope  and  research  productivity.  The  final  column  of  Table  (3)  presents 
results  conditioning  on  past  innovative  success  by  including  the  stock  of  past  patents.  The  coefficient  on 
R&D  stock  falls  somewhat,  but  the  coefficients  for  the  scope  variables  are  largely  unchanged. 

If  we  did  not  have  access  to  program  level  data  we  might  stop  here,  noting  that  our  finding  that 
there  are  no  returns  to  scale  per  se  in  research  is  consistent  with  the  previous  quantitative  studies  of  the 
industry  and  with  the  qualitative  evidence  that  suggests  that  there  are  limited  returns  to  increasing  the  size 
of  specific  discovery  programs. 

Analysis  of  Program  Level  Data 

Tables  (3)  through  (6)  present  our  program  level  results.  Table  (3)  re-estimates  the  models  of 
Table  (2)  at  the  program  level  using  the  complete  disaggregated  data  set  and  introduces  some  of  our 
spillover  measures.  We  find  no  evidence  for  returns  to  scale  at  the  level  of  the  research  program.  Indeed 
the  estimated  coefficients  imply  quite  sharply  decreasing  marginal  returns  to  increasing  investment  in  any 
single  program.  However  there  do  appear  to  be  economies  of  scale  and  scope  at  the  level  of  the  firm. 
Ceteris  paribus  programs  embedded  in  larger  and  more  diversified  firms  appear  to  be  significantly  more 
productive. 

Equations  (1)  and  (2)  demonstrate  the  importance  of  controlling  for  firm  and  therapeutic  class 
effects. Including firm and class dummies in the regression leads to a very substantial increase in the
log-likelihood function, which corresponds to an increase in R² from 0.10 to 0.46. 10 To save space the


10 R² is calculated as the squared correlation between observed values of Y and fitted values of Y
from  the  Poisson  regression.  Since  the  firm  and  therapeutic  class  dummies  are  not  orthogonal  (the 
first  three  canonical  correlation  coefficients  between  the  two  sets  of  variables  are  between  0.2  and 
0.1)  their  separate  contributions  to  explaining  the  dependent  variable  cannot  easily  be  determined. 
coefficients on the 16 therapeutic class dummies are not reported here. With minor exceptions they
are  statistically  distinguishable  from  each  other  as  well  as  from  zero.  The  coefficients  on  these  variables 
are  quite  large:  all  else  equal,  programs  in  the  most  productive  area  (arthritis  and  related  disorders) 
generate  about  2.2  times  as  many  important  patents  as  programs  conducted  in  the  least  productive  field 
(anti-infectives),  while  programs  conducted  in  the  most  successful  firm  in  our  sample  are  more  than  twice 
as  productive  as  those  in  the  least  productive  firm.  Note  though  that  the  economic  value  of  patents  differs 
across  therapeutic  classes,  and  as  stressed  above  coefficients  on  firm  dummies  pick  up  factors  such  as 
firms'  propensity  to  patent  as  well  as  "real"  differences  in  research  productivity. 
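
The fit statistic quoted above (R² as the squared correlation between observed and fitted values from a Poisson regression) can be illustrated with a minimal sketch. It assumes a single regressor and a log-linear mean, E[y] = exp(b0 + b1 x); the function names and the Newton-Raphson routine are illustrative assumptions, not the estimation code used for the tables.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit E[y] = exp(b0 + b1 * x) by Newton-Raphson on the Poisson log-likelihood."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum of (y - mu) times each regressor (constant, x).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix: sum of mu * [1, x][1, x]'.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def squared_correlation(observed, fitted):
    """Squared Pearson correlation between observed and fitted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(fitted) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, fitted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    return cov * cov / (var_o * var_f)


# Hypothetical patent counts against a hypothetical regressor.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 6.0, 9.0]
b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
r_squared = squared_correlation(y, fitted)
```

Because this statistic is a correlation rather than a share of explained likelihood, adding dummies can raise it substantially even when the remaining coefficients barely move.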

While  the  firm  dummies  alone  account  for  a  substantial  amount  of  the  variance  in  the  dependent 
variable,  the  discovery  elasticities  do  not  change  much  when  they  are  included  in  the  model.  By  contrast 
the  coefficients  on  the  discovery  variables  fall  by  about  a  third  when  the  class  dummies  are  also  included, 
highlighting  the  importance  of  controlling  for  cross-sectional  differences  in  technological  opportunity  at 
this  very  detailed  level.  Interestingly  even  the  largest  firms  in  the  sample  do  not  invest  in  all  therapeutic 
classes,  even  though  they  clearly  have  the  resources  to  do  so  if  they  wished:  an  important  part  of  the  firm 
effect  identified  in  the  firm  level  data  appears  to  lie  in  the  firm's  choice  of  research  programs. 

Equation  (3)  introduces  our  measure  of  scale  into  the  analysis.  SIZE  enters  with  a  positive  and 
significant  coefficient,  suggesting  that  all  other  things  equal  programs  embedded  in  larger  firms  will  be 
more  productive  than  programs  embedded  in  smaller  rivals.  This  effect  is  quite  large:  at  the  mean  the 
elasticity  of  patent  output  with  respect  to  SIZE  is  0.26,  so  that  relocating  the  "mean  program"  for 
example,  from  the  "mean  firm"  to  a  firm  whose  research  budget  is  10%  higher  is  associated  with  an 
increase  in  program  productivity  of  around  2.5%. 
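
The arithmetic behind this statement can be checked with a short sketch. It treats the elasticity as constant over the 10% change, which is only a local approximation, since the elasticity reported in the text is evaluated at the mean; the function name is hypothetical.

```python
def pct_change_in_output(elasticity, pct_change_in_input):
    """Percent change in output for a given percent change in input,
    holding the elasticity constant over the change (an approximation)."""
    ratio = 1.0 + pct_change_in_input / 100.0
    return (ratio ** elasticity - 1.0) * 100.0


# SIZE elasticity of 0.26 and a 10% larger research budget, as quoted in the text.
print(round(pct_change_in_output(0.26, 10.0), 1))  # prints 2.5
```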

In  equation  (4)  we  introduce  our  measures  designed  to  capture  the  effect  of  scope.  Just  as  in  the 
firm  level  results,  SCOPE  has  a  significant  and  non  linear  impact  on  research  productivity.  At  the  mean 
the  elasticity  of  patent  output  with  respect  to  SCOPE  is  0.30,  so  that  moving  the  mean  program  from  the 
mean  firm  to  a  firm  that  is  running  one  more  "critical  mass"  program  raises  its  productivity  by  around 
15%.  "News"  in  the  output  of  related  programs  within  the  same  firm  enters  with  a  positive  and  strongly 


Firm effects alone give an R² of 0.21, while class effects alone give an R² of 0.32.

Notice  that  with  both  firm  fixed  effects  and  therapeutic  class  fixed  effects  we  are  quite  close 
to estimating a non-linear panel data model with fixed (program) effects. While we believe that it is
important to control for these effects at these levels of aggregation, the large numbers of dummy
variables (10 firms and 16 therapeutic classes) limit our ability to obtain stable estimates if we try to,
for  example,  use  time  dummies  instead  of  trend  variables. 
significant coefficient,11 with an elasticity at the mean of 0.03. Although this may seem small, recall that
given  the  assumption  implicit  in  our  functional  form,  this  elasticity  increases  in  magnitude  as  related 
programs  become  more  successful,  and  the  impact  of  internal  spillovers  becomes  much  larger.  If  related 
programs  are  generating  additional  patents  at  the  rate  of  one  standard  deviation  above  the  mean,  for 
example,  the  elasticity  jumps  to  0.12. 
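
The "news" variables used here are defined in footnote 11 as the current flow less the depreciation of the previous period's stock. A minimal sketch of that construction follows. The perpetual-inventory rule used to update the stock, the depreciation rate of 0.2, and the function name are assumptions made for illustration only.

```python
def news_series(flows, delta, initial_stock=0.0):
    """Return N_t = X_t - delta * K_{t-1} for each period's flow X_t."""
    stock = initial_stock
    news = []
    for flow in flows:
        # News: output beyond what is needed to replace depreciated stock.
        news.append(flow - delta * stock)
        # Perpetual-inventory update (an assumption): K_t = (1 - delta) * K_{t-1} + X_t.
        stock = (1.0 - delta) * stock + flow
    return news


# Hypothetical patent flows for a related program, with an assumed delta of 0.2.
related_patents = [10.0, 10.0, 14.0, 9.0]
related_news = news_series(related_patents, delta=0.2)
```

Under these assumptions a program that merely maintains its stock generates zero news, so the variable is positive only when output "spurts" above the replacement level.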

In  equation  (5)  we  condition  on  past  success  by  including  the  stock  of  past  patents,  our  measure 
of  knowledge  capital.  This  variable  is  highly  significant,  and  markedly  improves  the  log-likelihood  and 
the fit of the equation (R² increases from 0.47 to 0.67). While the SIZE and SCOPE effects are
essentially  unchanged,  the  coefficients  on  discovery  fall  sharply.  This  is  consistent  with  the  hypothesis 
that  successful  patent  applications  are  driven  by  the  available  stock  of  knowledge  capital,  and  that  the 
better  measure  of  this  stock  is  innovative  output  rather  than  innovative  input.  However  it  may  also  reflect 
two specification problems: the patent stock may be proxying for a variety of unobserved correlated
effects,  such  as  the  quality  of  program-specific  assets,  or  there  may  be  a  problem  with  the  exogeneity  of 
the  discovery  variables  with  respect  to  patents.  Conditioning  on  past  success  may  simply  be  purging  the 
discovery  coefficients  of  the  part  which  is  the  endogenous  response  to  past  success. 

In  equation  (6)  we  introduce  our  measures  of  external  spillovers  in  the  narrow  therapeutic  class 
and  from  related  fields.  Both  enter  with  positive  and  significant  coefficients,  with  elasticities  of  about  0.1 
at  the  mean.  Again,  though  the  coefficients  may  seem  quite  small,  they  can  generate  quite  significant 
effects,  and  together  they  account  for  roughly  eight  percent  of  the  total  variance  across  programs  and 
firms.  At  the  mean,  for  example,  a  program  whose  competitors'  programs  in  the  same  and  in  related 
fields  are  roughly  10%  more  productive  will  be  approximately  2%  more  productive  itself. 

Taken  together  the  results  of  Table  (3)  suggest  that  our  first  four  hypotheses  are  strongly 
supported  by  the  data.  As  our  qualitative  analysis  suggested,  programs  conducted  within  large  firms  are 


11     In  exploratory  work  we  included  both  flow  and  stock  versions  of  these  variables  in  the  model, 
but  concerns  about  the  presence  of  measurement  error  (in  many  cases  the  coefficients  were  equal  but 
opposite-signed suggesting that they were picking up the same underlying factor) led us to use the
"news" formulation presented here, in which news in X is given by $N_t = X_t - \delta K_{t-1}$, where K is the
stock of X and $\delta$ is the depreciation rate. This construction reduces the measurement error problem
and  has  an  informative  interpretation:  own  research  productivity  is  higher  when  the  output  of  spillover 
sources  "spurts"  beyond  the  level  required  simply  to  maintain  their  previous  stock. 

We  also  experimented  with  "news  in  spending"  in  related  classes  as  an  alternative  measure  of 
spillovers.  Entered  alone  it  was  positive  and  marginally  significant,  but  became  insignificant  when 
"news  in  related  patents"  was  also  included  in  the  regression.  We  interpret  this  as  suggesting  that 
patents  are  a  better  measure  of  the  spillovers  of  knowledge  than  dollars  of  research  spending. 



at  a  significant  advantage,  where  these  advantages  seem  to  flow  as  much  from  the  opportunity  to  exploit 
economies  of  scope  as  they  do  economies  of  scale. 

Table  (4)  explores  the  sensitivity  of  our  results  to  the  use  of  the  Poisson  assumption  by  presenting 
alternative  estimates  of  equation  (6)  using  the  various  statistical  models  discussed  in  section  2.  (Equation 
(6)  is  duplicated  from  the  previous  table  to  allow  easy  comparison  of  results.)  Equation  (7)  gives  Negative 
Binomial  estimates,  where  the  variance  is  modelled  as  an  increasing  function  of  the  mean,  equivalent  to 
adding  an  unobserved  gamma  distributed  random  program  effect  to  the  model.12  The  coefficients  are 
broadly  comparable  to  those  obtained  in  the  Poisson  model,  with  slightly  inflated  standard  errors,  though 
the  coefficients  on  stock  of  patents  and  current  discovery  rise  quite  substantially  and  news  in  competitors' 
patents in related programs is no longer significant. Equation (8) is the consistent but not efficient
non-linear least squares model, $Y = \exp(X\beta) + \varepsilon$, with robust (White) standard errors. In this model, the
discovery  variables  lose  their  significance  altogether,  although  by  and  large  with  the  exception  of  the 
coefficient  of  News  in  Competitors'  patents  in  this  program,  the  other  results  carry  through  from  the 
Poisson specification. Finally, equation (9) gives the results from the weighted NLLS GMT estimator,
which, if the $\exp(X\beta)$ part of the model is correctly specified, will give consistent and efficient parameter
estimates.  Results  are  broadly  comparable  with  the  Poisson  estimates,  with  slightly  larger  standard  errors. 
In  order  to  allow  comparison  with  prior  studies,  in  the  remainder  of  the  paper  we  continue  to  report 
results  obtained  using  the  consistent  Poisson  estimator,  but  neither  the  magnitude  nor  the  direction  of  our 
results  change  significantly  when  we  use  the  more  computationally  expensive  estimation  techniques. 
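
The difference between the Poisson and Negative Binomial specifications lies in the variance assumption. As a sketch, the Negbin II form cited in footnote 12 is written here in its standard parameterization, with variance equal to the mean plus a dispersion parameter times the squared mean; the dispersion value and function names below are hypothetical.

```python
def poisson_variance(mean):
    """Poisson: the conditional variance equals the conditional mean."""
    return mean


def negbin2_variance(mean, alpha):
    """Negbin II: variance = mean + alpha * mean**2, with alpha >= 0."""
    return mean + alpha * mean ** 2


# With a hypothetical dispersion of 0.5, overdispersion grows with the mean.
for mu in (1.0, 2.0, 4.0):
    print(mu, poisson_variance(mu), negbin2_variance(mu, alpha=0.5))
```

When the dispersion parameter is zero the two variance functions coincide, which is why the Poisson model can be viewed as a restricted case.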

In  Table  (5)  we  further  investigate  the  effects  of  the  diversity  of  the  discovery  effort  on  research 
productivity.  Recall  that  the  measure  that  we  used  in  Table  (3),  SCOPE,  is  defined  as  the  number  of 
programs  operating  at  greater  than  "critical  mass"  which  we  defined  as  an  annual  investment  of  $500,000 
1986  dollars.  In  Equation  (10)  we  introduce  SMALL,  the  number  of  programs  that  were  smaller  than  this 
threshold,  to  test  our  hypothesis  that  these  much  smaller  programs  affect  productivity  by  imposing 
coordination  costs.  SMALL  enters  with  a  negative  and  significant  coefficient,  suggesting  that  all  other 
things  equal,  programs  embedded  in  firms  that  have  many  of  these  small  programs  may  suffer  from 
increased  coordination  costs,  consistent  with  our  interpretation  of  the  negative  effect  of  SCOPE-squared. 
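
The two portfolio counts can be sketched as follows. The threshold comes from the text; treating a program exactly at the threshold as "critical mass", ignoring programs with zero spending, and the function name are all assumptions.

```python
CRITICAL_MASS = 500_000  # annual investment threshold in 1986 dollars, from the text


def scope_and_small(program_budgets, threshold=CRITICAL_MASS):
    """Count programs at or above the threshold (SCOPE) and below it (SMALL)."""
    active = [b for b in program_budgets if b > 0]  # ignore inactive programs
    scope = sum(1 for b in active if b >= threshold)
    small = sum(1 for b in active if b < threshold)
    return scope, small


# Hypothetical annual discovery budgets for one firm, in 1986 dollars.
budgets = [2_000_000, 750_000, 500_000, 120_000, 40_000, 0]
scope, small = scope_and_small(budgets)
```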

In equation (11) we include the Herfindahl of the research portfolio, HERF, to test for the impact
of  the  "focus"  of  the  research  portfolio  on  program  productivity.  The  coefficient  of  HERF  is  positive  and 
highly  significant,  while  HERF-squared  enters  negatively  and  significantly,  suggesting  that  there  is  a 


12  Cameron  and  Trivedi's  (1986)  Negbin  II  model. 
substantial return to focus but, once again, that this return is only available up to a certain point. Equation
(12)  includes  both  HERF  and  SCOPE.  While  there  does  appear  to  be  additional  information  in  HERF, 
econometrically  it  is  difficult  to  separate  out  the  effects  of  SIZE,  SCOPE  and  HERF.  These  variables  are 
measured  at  the  firm  level,  and  in  general  move  smoothly  together  over  time.  Moreover  most  of  the 
variance  in  SCOPE  and  HERF  is  between  firms  rather  than  within  firms. 
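
A Herfindahl index of portfolio focus can be sketched as the sum of squared spending shares. Computing the shares over program-level discovery spending is an assumption about how HERF is constructed, and the function name is hypothetical.

```python
def herfindahl(program_budgets):
    """Sum of squared spending shares: 1.0 for a single program,
    1/n for n equally funded programs."""
    total = sum(program_budgets)
    if total <= 0:
        raise ValueError("total spending must be positive")
    return sum((b / total) ** 2 for b in program_budgets)


# Hypothetical portfolios: one concentrated, one spread evenly.
focused = herfindahl([8.0, 1.0, 1.0])
diffuse = herfindahl([1.0, 1.0, 1.0, 1.0, 1.0])
```

Because the index falls as programs are added at equal size, it tends to move inversely with SCOPE, which is one reason the two effects are hard to separate.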

Throughout Tables (2) to (5), the continued significance of Dum78 and of Dum78 interacted with
the time trend suggests that there was a significant shift of regime in 1978. In Table (6) we investigate the
changing  role  of  scale,  scope  and  spillovers  over  time  by  splitting  the  sample  into  two:  1961-1978  and 
1979-1988.  Equation  (12)  is  included  for  comparison.  Not  surprisingly  given  the  large  changes  in  the 
estimated  coefficients,  tests  for  equality  of  all  coefficients  across  the  two  samples  easily  reject  this 
hypothesis.  This  result  is  driven  to  a  large  extent  by  substantial  differences  in  the  coefficients  of  the 
therapeutic  class  dummies  across  subsamples,  probably  reflecting  changes  in  technological  opportunity 
across  fields,  but  there  are  also  significant  differences  in  the  estimated  effects  of  our  measures  of  scope, 
scale  and  spillovers  on  research  productivity. 

Taken  at  face  value,  these  results  suggest  that  any  returns  to  scale  per  se  disappeared  after  1978. 
In  the  short  term  there  even  appear  to  be  negative  marginal  returns  to  increasing  investment  in  any 
particular  program,  and  the  coefficient  on  SIZE  falls  to  almost  zero.  Lower  short  run  returns  to 
investment  in  research  are  consistent  with  the  "retooling"  necessary  to  adopt  the  techniques  of  rational 
drug design, but the collapse in the coefficient on SIZE is implausible given the qualitative evidence about
the  increasing  importance  of  fixed  costs  and  specialized  scientific  knowledge.  We  believe  that  this  result 
reflects  confounding  between  overall  trends  in  the  data  rather  than  any  real  change  in  the  relationship 
between  SIZE  and  research  productivity.  All  the  firms  in  our  sample  increased  their  size  over  this  period 
and  did  so  at  approximately  the  same  rate,  while  patents  per  program  fell  dramatically  across  all  firms 
and  therapeutic  classes  (Figure  (1)).  The  strong  negative  correlation  between  these  trends  swamps  any 
marginal  effect  of  SIZE  which  might  be  discernable  in  the  cross-section,  despite  our  efforts  to  control 
for,  for  example,  an  increase  in  the  average  "quality"  of  patents  by  including  a  time  trend.13 

But  while  all  the  firms  in  our  sample  increased  their  investments  in  discovery  more  or  less  in 
concert,  these  larger  research  budgets  were  allocated  in  quite  different  ways.  On  average  firms  deepened, 
rather  than  widened,  their  research  efforts,  but  some  firms  grew  by  adding  programs  while  others  held 


13  Exploratory  calculations  suggest  that  simple  ways  of  controlling  for  changes  in  the 
"significance"  or  "quality"  of  patents  would  have  little  impact:  for  example,  there  is  no  trend  evident 
in  the  average  number  of  citations  per  pharmaceutical  patent  between  1970  and  1985. 
the scope of their portfolio constant and altered its focus. Thus we are more confident that the estimated
relationship  between  the  structure  of  the  research  portfolio  and  program  productivity  is  a  real  economic 
phenomenon  rather  than  an  artifact  of  the  measurement  properties  of  the  data,  and  that  the  greatly  reduced 
coefficient  on  SMALL  and  the  substantial  increase  in  the  coefficient  of  the  internal  spillover  variable  are 
consistent  with  our  hypothesis  that  the  transition  from  random  to  rational  drug  design  increased  the 
economies  of  scope  realized  by  the  more  diverse  firms. 

At  the  same  time,  the  influence  of  external  spillovers  on  research  productivity  also  changed. 
While  prior  to  1978  the  benefits  of  spillovers  from  competitors  were  realized  primarily  within  narrow 
therapeutic  areas,  after  1978  the  reverse  appears  to  be  true.  Programs  benefited  primarily  from  work 
conducted  by  their  competitors  in  related  therapeutic  areas,  rather  than  from  research  focused  on  the  same 
disease  targets.  This  is  consistent  with  our  beliefs  about  the  ways  in  which  the  transition  to  rational  drug 
design  has  changed  the  nature  of  spillovers  within  the  industry.  As  fundamental  knowledge  in  fields  such 
as  biochemistry  and  physiology  has  become  increasingly  important  to  the  process  of  drug  discovery, 
research  productivity  has  become  increasingly  driven  by  the  synthesis  and  cross  fertilization  of  related 
bodies  of  knowledge.  Understanding  and  being  able  to  take  advantage  of  competitors'  advances  in  many 
areas  of  neurochemistry,  for  example,  has  become  relatively  more  valuable  to  a  firm  attempting  to 
develop  drugs  for  the  treatment  of  depression  than  information  about  the  specifics  of  competitive  research 
in  that  particular  field. 

5.  Discussion  and  Conclusions 

We  find  evidence  for  significant  economies  of  scope  and  scale  and  significant  spillover  effects 
in  the  conduct  of  pharmaceutical  research.  All  other  things  equal,  programs  embedded  in  larger  research 
efforts  are  significantly  more  productive  than  rival  programs  embedded  in  smaller  firms.  Prior  to  1978 
this  advantage  was  driven  both  by  the  ability  to  share  fixed  costs  and  by  the  ability  to  exploit  internal 
economies  of  scope,  while  after  1978  the  primary  advantage  of  larger  firms  appears  to  have  been  their 
ability  to  sustain  a  sufficiently  diverse  research  effort.  As  pharmaceutical  research  has  moved  from  a 
regime  of  "random"  discovery  to  one  of  "rational"  drug  design  and  the  evolution  of  biomedical  science 
has  placed  an  increasing  premium  on  the  ability  to  exchange  information  within  the  firm,  our  results 
indicate that the primary advantage of size has become the ability to exploit internal economies of scope --
particularly the ability to exploit internal spillovers of knowledge -- rather than any economy of scale
per  se.  As  Cohen  and  Levinthal  (1989)  have  pointed  out,  the  benefits  of  spillovers  can  only  be  realized 
by  incurring  the  costs  of  maintaining  absorptive  capacity,  which  take  the  form  here  of  large  numbers  of 
small and apparently unproductive programs. We believe that it is these effects which account for the
presence  of  very  large  research  oriented  firms,  despite  sharply  decreasing  marginal  returns  to  research 
spending  at  the  level  of  the  individual  research  program. 

While  our  results  provide  considerable  empirical  support  for  the  theoretical  work  that  has 
suggested that the presence of economies of scope is more important for determining the boundaries of
the  firm  and  relative  firm  performance  than  economies  of  scale,  they  also  have  relevance  for  theories  of 
economic  growth.  In  this  industry  at  least,  external  and  internal  spillovers  affect  research  productivity 
quite  differently.  Knowledge  generated  within  the  firm  is  much  more  efficiently  transferred  across 
programs  than  the  results  of  research  conducted  outside  the  firm,  and  industry  structure  and  the  location 
of  firm  boundaries  may  therefore  have  strong  implications  for  patterns  of  economic  growth.  The  role  of 
knowledge  externalities  has  also  changed  significantly  over  time,  as  the  industry  has  moved  closer  to  basic 
science.  During  the  transition  from  "random"  to  "rational"  drug  discovery  the  industry's  knowledge  pool 
became  much  wider:  prior  to  1978  the  most  important  spillovers  to  any  given  program  came  from  rivals 
conducting research in the same area, while after 1978 spillovers from competitive research in a broader pool
of  related  technological  areas  had  the  most  significant  effect  on  productivity. 

Interestingly,  although  our  measures  of  the  scope  and  scale  of  firms'  research  efforts  are 
consistently  significant,  most  of  the  variance  in  research  productivity  is  attributable  to  idiosyncratic 
program,  therapeutic  class  and  firm  effects.  Our  measure  of  knowledge  capital  -  the  accumulated  stock 
of  patents  obtained  by  the  program  in  the  past  -  is  the  single  most  important  variable  in  our  regressions, 
accounting  for  up  to  20%  of  the  variation  in  patenting  across  programs.  Firm  and  therapeutic  class 
dummies  account  for  a  further  30-40%,  and  some  of  their  coefficients  are  remarkably  large. 

One  of  our  most  intriguing  findings  is  the  persistence  of  marked  differences  in  research  strategy 
across  firms.  Our  results  imply  that  quite  small  changes  in  the  scale  or  scope  of  a  research  portfolio  have 
a  significant  impact  on  productivity  at  the  margin,  yet  differences  in  the  structure  of  the  portfolio  across 
firms  show  great  persistence  over  time.  While  the  mean  value  of  SCOPE  is  quite  close  to  the  "optimal" 
value  predicted  by  our  estimates,  there  are  firms  in  the  data  set  that  invest  in  significantly  more  or 
significantly  fewer  programs  for  extended  periods  of  time.  Similarly  nearly  every  firm  in  the  sample 
would  appear  to  benefit  from  having  a  more  focused  program,  yet  differences  in  HERF  across  firms  are 
remarkably  constant.  In  combination  with  our  finding  that  there  are  very  significant  firm  effects  in 
research  productivity  above  and  beyond  the  structure  of  the  research  portfolio,  these  results  speak  to  the 
continuing  debate  about  the  nature  and  importance  of  firm  heterogeneity  (Lippman  and  Rumelt,  1982; 
Rumelt,  1991;  Schmalensee,  1985). 
They also highlight the dangers inherent in using our estimates to speculate as to the "optimal"
scale  and  scope  of  a  modern  pharmaceutical  research  effort.  Interpreted  literally,  the  results  of  Table  (6) 
suggest  that  prior  to  1978,  for  example,  the  "optimal"  research  portfolio  had  9.8  programs  of  at  least  a 
half  million  dollars  each  and  a  HERF  of  about  0.33,  while  after  1978  the  "optimal"  portfolio  had  only 
8.3  "critical  mass"  programs  and  a  HERF  of  about  0.32.  But  the  effects  that  we  have  estimated  are 
marginal  effects.  While  doubling  the  size  of  an  existing  program  or  adding  an  additional  program  to  the 
portfolio,  for  example,  might  increase  the  productivity  of  adjacent  programs  through  its  effects  on  scale 
and  scope,  the  net  effect  of  such  a  change  on  the  productivity  of  the  entire  research  effort  is  difficult  to 
estimate.  In  reality  one  cannot  add  an  average  program  to  the  average  portfolio.  Research  programs  are 
not  homogeneous,  and  their  heterogeneity  -  particularly  the  degree  to  which  they  have  been  successful 
in  the  past  -  has  important  implications  for  productivity,  so  that  determining  the  size  and  shape  of  the 
optimal  research  portfolio  requires  solving  a  complex,  non  linear  constrained  optimization  problem  whose 
parameters  are  not  fully  known. 
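
For reference, the literal "optimal" values discussed above are turning points of an estimated inverted-U. When a variable S enters the exponent of the model as b1*S + b2*S^2 with b2 negative, expected output peaks at S = -b1 / (2*b2). The sketch below uses hypothetical coefficients, not the estimates from Table (6), and the function name is an assumption.

```python
def turning_point(linear_coef, quadratic_coef):
    """Value of S that maximizes linear_coef * S + quadratic_coef * S**2."""
    if quadratic_coef >= 0:
        raise ValueError("an inverted-U requires a negative quadratic coefficient")
    return -linear_coef / (2.0 * quadratic_coef)


# Hypothetical coefficients chosen only to illustrate the calculation.
optimal_scope = turning_point(0.2, -0.01)
```

The calculation holds every other regressor fixed, which is exactly the limitation noted above: it describes a marginal effect, not the payoff to restructuring an entire portfolio.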

To the degree that our results are generalizable across other knowledge-intensive industries, they
suggest  that  moving  beyond  aggregate  data  to  a  more  detailed  exploration  of  the  evolution  of  firm 
heterogeneity  may  be  a  fruitful  approach  to  the  development  of  a  richer  understanding  of  the  role  of 
research and internal firm structure in industrial competition. We suspect, for example, that the
"stickiness" of research portfolios evident in our results may reflect underlying differences in research
strategy  across  firms  that  flow  from  more  fundamental  heterogeneities  in  organizational  structure  and 
capabilities,  an  issue  which  we  are  actively  exploring  in  our  current  research. 
Appendix: Data Sources and Construction

The  data  set  used  in  this  study  is  based  on  detailed  data  on  R&D  inputs  and  outputs  at  the 
research  program  level  for  ten  ethical  pharmaceutical  manufacturers. 


Inputs 

Our  data  on  inputs  to  the  drug  research  process  are  taken  from  the  internal  records  of 
participating  companies,  and  consist  primarily  of  annual  expenditures  on  exploratory  research  and 
research  by  research  program.  Several  issues  arise  in  dealing  with  these  data. 

(a)  Research  vs.  Development 

We  define  resources  devoted  to  research  (or  "discovery,"  in  the  terminology  of  the  industry)  as 
all  pre-clinical  expenditures  within  a  therapeutic  class,  and  development  as  all  expenses  incurred  after  a 
compound  has  been  identified  as  a  development  candidate.  We  attributed  exploratory  research  to  a 
particular  program  wherever  possible,  but  exploratory  research  that  could  not  be  so  assigned  was  included 
in  overhead.  Clinical  grants  are  included  in  the  figures  for  development,  and  grants  to  external 
researchers  for  exploratory  research  are  included  in  the  total  for  research.  In  some  cases,  the  companies 
supplied  us  with  data  already  broken  down  by  research  versus  development  by  research  program.  In 
others,  we  had  to  classify  budget  line  items  for  projects/programs  into  the  appropriate  category.  This  was 
done  based  on  the  description  of  each  item  in  the  original  sources,  and  the  location  of  items  within  the 
structure  of  the  company's  reporting  procedure. 

(b)  Overhead 

In  order  to  maintain  as  much  consistency  in  the  data  collection  process  as  possible,  we  tried  to 
include  appropriate  overhead  charges  directly  related  to  research  activities,  such  as  computing,  R&D 
administration  and  finance  etc.,  but  to  exclude  charges  relating  to  allocation  of  central  office  overhead 
etc.  The  overhead  also  includes  some  expenditures  on  discipline-based  exploratory  research  such  as 
"molecular  biology"  which  appeared  not  to  be  oriented  towards  specific  therapies.  Overhead  was  allocated 
across  therapeutic  classes  according  to  their  fraction  of  total  spending. 
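
The proportional allocation rule can be sketched as follows, assuming that "fraction of total spending" refers to each class's share of directly attributed spending. The class names, amounts, and function name are hypothetical.

```python
def allocate_overhead(direct_spending, overhead):
    """Add to each class a share of overhead proportional to its direct spending."""
    total = sum(direct_spending.values())
    if total <= 0:
        raise ValueError("total direct spending must be positive")
    return {
        cls: amount + overhead * amount / total
        for cls, amount in direct_spending.items()
    }


# Hypothetical direct spending by therapeutic class and a hypothetical overhead pool.
spending = {"cardiovascular": 60.0, "anti-infectives": 40.0}
with_overhead = allocate_overhead(spending, overhead=10.0)
```

A proportional rule leaves each class's share of total spending unchanged, so it does not alter share-based measures such as a Herfindahl index.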

(c)  Licensing 

We  treat  up-front,  lump  sum  payments  in  respect  of  in-licensing  of  compounds,  or  participation 
in  joint  programs  with  other  pharmaceutical  companies,  universities  or  research  institutes,  as  expenditure 
on  research.  Royalty  fees  and  contingent  payments  are  excluded.  Though  increasing  over  time, 
expenditures  on  licensing  are  a  vanishingly  small  fraction  of  research  spending  in  this  sample. 


Outputs 

In  this  paper  we  use  "important"  patent  grants  as  our  measure  of  research  output.  We  count 
patents  by  year  of  application,  where  we  define  "importance"  by  the  fact  that  the  patent  was  granted  in 
two  of  the  three  major  markets:  the  USA,  Japan,  and  the  European  Community.  These  data  were 
provided by Derwent Publications Inc., who used their proprietary classification and search software to
produce counts of "important" patents for us, broken down by therapeutic class, for 29 US, European, and
Japanese pharmaceutical manufacturers for the period 1961 to 1990. These firms were chosen to include the ten
firms  that  have  given  us  data  together  with  19  other  firms  chosen  on  the  basis  of  their  absolute  R&D 
expenditures,  R&D  intensity,  and  national  "home  base"  to  try  to  get  a  representative,  rather  than 
exhaustive,  assessment  of  world-wide  patenting  activity.  The  19  firms  have  been  consistently  in  the  top 
40  world  wide  pharmaceutical  firms  in  terms  of  R&D  dollars  and  sales. 
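
The "importance" filter and the counting by year of application can be sketched as follows. Reading "two of the three" as "at least two", along with the record layout and function names, is an assumption; the actual counts were produced by Derwent's proprietary software.

```python
from collections import Counter

MAJOR_MARKETS = {"USA", "Japan", "European Community"}


def is_important(granted_in):
    """True if the patent was granted in at least two of the three major markets."""
    return len(MAJOR_MARKETS & set(granted_in)) >= 2


def count_important_by_year(patents):
    """Count important patents by year of application.

    `patents` is an iterable of (application_year, markets_granted) pairs.
    """
    return Counter(year for year, markets in patents if is_important(markets))


# Hypothetical patent records.
records = [
    (1980, ["USA", "Japan"]),
    (1980, ["USA"]),
    (1981, ["Japan", "European Community", "USA"]),
]
counts = count_important_by_year(records)
```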

Note  that  many  of  these  patents  will  be  "defensive"  patents  in  that  firms  may  patent  compounds 
they  do  not  intend  to  develop  in  the  short  term  but  that  may  have  competitive  value  in  the  longer  term, 
and  that  we  were  not  able  to  exclude  process  patents.  Alternative  measures  of  "importance"  such  as 
citation  weighting  and  more  detailed  international  filing  data  proved  prohibitively  expensive  to  construct. 

Classification 

Classification  of  inputs  and  outputs  by  therapeutic  class  is  important  because  this  drives  our 
measure of spillovers. There are essentially two choices: to define programs by physiological
mechanisms,  e.g.  "prostaglandin  metabolism",  or  by  "indications"  or  disease  states,  e.g.  "arthritis".  We 
have  chosen  to  classify  on  the  basis  of  indication,  largely  because  this  corresponds  well  to  the  internal 
divisions  used  by  the  companies  in  our  sample  (which  is  conceptually  correct),  but  also  because 
classification by mechanism is much more difficult (a practical concern). We classified both inputs and
outputs  according  to  a  scheme  which  closely  follows  the  IMS  Worldwide  classes.  This  scheme  contains 
two  tiers  of  aggregation:  a  detailed  "research  program"  level,  and  a  more  aggregated  "therapeutic  class" 
level  which  groups  related  programs.  For  example,  the  therapeutic  class  "cardiovascular"  includes  the 
research  programs  "anti-hypertensives",  "cardiotonics",  "antithrombolytics",  "diuretics"  etc. 

There  are  some  problems  with  this  procedure.  Firstly,  some  projects  and  compounds  are  simply 
very  difficult  to  classify.  A  particular  drug  may  be  indicated  for  several  quite  distinct  therapies:  consider 
serotonin,  which  has  quite  different  physiological  actions  on  either  side  of  the  blood-brain  barrier.  As  a 
neurotransmitter  it  is  believed  to  play  important  roles  in  mediating  motor  functions.  As  a  systemic 
hormone it has a variety of effects on smooth muscle; for example, it functions as a vasoconstrictor. Some
companies  report  expenditures  in  areas  which  are  very  difficult  to  assign  to  particular  therapeutic  classes: 
a  company  doing  research  using  rDNA  technology  might  charge  expenditure  to  an  accounting  category 
listed  as  "Gene  Therapy/Molecular  Biology"  which  is  actually  specific  research  performed  on  e.g.  cystic 
fibrosis,  but  we  were  forced  to  include  these  expenditures  in  "overhead".  Secondly,  our  two-tier 
classification  scheme  may  not  catch  all  important  relationships  between  different  therapeutic  areas.  We 
believe  that  we  are  undercounting,  rather  than  overcounting  spillovers  in  this  respect.  Thirdly,  where 
firms  supplied  us  with  "pre-digested"  data,  they  may  have  used  substantively  different  conventions  in 
classifying  projects.  One  firm  may  subsume  antiviral  research  under  a  wider  class  of  anti-infectives,  while 
another  may  report  antivirals  separately.  Not  surprisingly  there  are  major  changes  within  companies  in 
internal  divisional  structures,  reporting  formats,  and  so  forth,  which  may  also  introduce  classification 
errors.  After  working  very  carefully  with  these  data,  we  recognize  the  potential  for  significant  miss- 
assignment  of  outputs  to  inputs,  but  we  believe  that  such  errors  that  remain  are  not  serious.  Using  patents 
(as  opposed  to  INDs  or  NDAs)  as  the  output  measure  should  reduce  our  vulnerability  to  mis  problem, 
since  we  observe  relatively  large  numbers,  and  a  few  miss-classifications  are  unlikely  to  seriously  affect 
our  results. 

Matching

Data  series  on  inputs  and  outputs  for  each  firm  were  matched  at  the  research  program  level.  This 
procedure  appears  to  successfully  match  outputs  and  inputs  unambiguously  for  the  great  majority  of 
programs.  In  a  very  few  cases,  however,  we  ended  up  with  research  programs  where  patents,  INDs  or 
NDAs  were  filed,  but  where  there  were  no  recorded  expenditures.  Of  these  the  majority  were  obviously 
coding  errors  or  reflected  dilemmas  previously  encountered  in  the  classification  process,  and  appropriate 
corrections were made. In other cases, it was clear that these reflected "spillovers": research done
ostensibly in, for example, hypertension may generate knowledge about the autonomic nervous system
which prompts patenting of compounds which may be useful in treating secretory disorders (e.g. ulcers).
In such cases we set "own" inputs for the program equal to zero, and included these observations in the
database.
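
The matching rule described above can be sketched as follows. This is a minimal illustration only, not the procedure actually coded for this paper: the function name, the dictionary layout, the key (firm, program, year) and the field names are all hypothetical. It also assumes that a program-year with recorded spending but no recorded output carries zero patents.

```python
def match_programs(inputs, outputs):
    """Join input and output series at the research program level.

    inputs:  dict mapping (firm, program, year) -> discovery spending
    outputs: dict mapping (firm, program, year) -> patent count
    Both layouts are hypothetical, chosen only for illustration.
    """
    matched = {}
    for key in sorted(set(inputs) | set(outputs)):
        spending = inputs.get(key)
        patents = outputs.get(key, 0)  # assumption: no record means zero output
        matched[key] = {
            # "Own" inputs are set to zero when output exists without spending.
            "spending": 0.0 if spending is None else spending,
            "patents": patents,
            # Flag these cases so each can be reviewed as a possible coding
            # error before being retained as a spillover observation.
            "no_own_input": spending is None and patents > 0,
        }
    return matched
```

Flagging, rather than silently zero-filling, keeps the cases that need manual review (coding errors versus genuine spillovers) easy to list.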


Deflation 

Since  our  data  sources  span  many  years,  it  is  important  to  measure  expenditures  in  constant  dollar 
terms.  We  used  the  biomedical  research  and  development  price  index  constructed  by  James  Schuttinga 
at  the  National  Institutes  of  Health.  The  index  is  calculated  using  weights  that  reflect  the  pattern  of  NIH 
expenditures  on  inputs  for  biomedical  research,  and  thus  in  large  measure  reflects  changes  in  the  costs 
of conducting research at academic institutions. However, since the firms in our sample compete directly
with academic research laboratories for scientific talent, we believe that this index is likely to be the most
appropriate publicly available index, and our results proved to be very robust to the use of alternative
indices. In a later paper we intend to exploit the information that some companies were able to give us
on  R&D  inputs  in  units  of  labor  hours  to  construct  an  index  specifically  for  research  costs  in  the 
pharmaceutical  industry. 
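
As an illustration of the deflation step, the sketch below converts nominal expenditures to constant dollars of a base year. The function name and data layout are hypothetical, and any index values passed to it are placeholders rather than the actual NIH biomedical R&D price index; the 1986 default reflects the 1986 dollars in which Table (1) reports spending.

```python
def to_constant_dollars(nominal, index, base_year=1986):
    """Deflate nominal spending to constant dollars of base_year.

    nominal: dict mapping year -> nominal expenditure
    index:   dict mapping year -> price index level (any common scale)
    """
    base = index[base_year]
    # Real spending = nominal spending * (index in base year / index in that year).
    return {year: value * base / index[year] for year, value in nominal.items()}
```

For example, with a made-up index of 50 in 1980 and 100 in 1986, $10m spent in 1980 corresponds to $20m in 1986 dollars.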


Construction  of  stock  variables 

Annual flows of research expenditures were capitalized following the procedure described
by Hall et al. (The R&D Masterfile: Documentation, NBER Technical WP #72). In brief, we first assume
a depreciation rate for "knowledge capital", δ, here equal to 20%. (This is consistent with previous
studies, and as argued above is not going to be very important in terms of its impact on the regression
results since, no matter what number we chose, if the flow series is reasonably smooth we would still find
it difficult to identify δ separately from the estimated coefficient on the stock variable.) We then calculate
a starting stock for each class within each firm based on the first observation on the annual flow: assuming
that real expenditures have been growing since minus infinity at a rate g, we divide the first observed
year's flow by δ+g. Each year, the end-of-year stock is set equal to the beginning-of-year stock net of
depreciation, plus that year's flow. For the cases where the annual flow was missing "within" a series
of observations, we set it equal to zero. In almost all instances, these missing values occur after the
expenditure flows have been declining towards zero: we are reasonably confident that these are "real" zeros and
not missing data which should be interpolated. We used the same procedure to accumulate "stocks" of
patents, based on the flow variables described above.
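
The stock construction can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the paper: the function name is hypothetical, the growth rate g = 0.05 is a placeholder (no value for g is given here), and the starting stock is read as the stock at the beginning of the first observed year.

```python
def knowledge_stock(flows, delta=0.20, g=0.05):
    """Capitalize annual flows into end-of-year stocks.

    flows: annual real flows in year order; None marks a year that is
           missing within the series.
    delta: depreciation rate of knowledge capital (20% in the text).
    g:     assumed pre-sample growth rate of real flows (placeholder value).
    """
    if not flows:
        return []
    # Missing flows within a series are treated as real zeros.
    filled = [0.0 if f is None else float(f) for f in flows]
    # Starting stock: first observed flow divided by (delta + g), taken
    # here as the stock at the beginning of the first observed year.
    stock = filled[0] / (delta + g)
    stocks = []
    for flow in filled:
        # End-of-year stock = beginning-of-year stock net of depreciation,
        # plus that year's flow.
        stock = (1.0 - delta) * stock + flow
        stocks.append(stock)
    return stocks
```

With flows of 10, 12, missing, and 8, the starting stock is 10/0.25 = 40 and the end-of-year stocks are 42, 45.6, 36.48 and 37.184. The same function applies unchanged to patent counts.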

Table (1): Descriptive Statistics: Selected variables at the research program level.

Regression Sample, N = 4930

Variable                                                 Mean  Std. Dev.    Minimum    Maximum

Discovery, 1986$m                                        1.06       2.31       0.00      20.18
Stock of Discovery                                      2.987       6.42       0.00      55.19
Own patents                                              1.86       3.47       0.00      34.00
Stock of own patents                                     6.59      11.22       0.00      96.18
Own patents in related programs                          3.56       6.26       0.00      60.00
News in own patents                                      0.94       3.74     -13.60      33.97
SCOPE: No. of programs with DISC > 500K                 10.71       5.12       1.00      21.00
SCALE: Total research spending this year                33.54       22.%       4.97       >120
HERF: Herfindahl of the research program                 0.18       0.11       0.07       0.92
SMALL: No. of programs with DISC < 500K                  5.54       4.75       0.00      18.00
Competitors' patents                                    38.62      44.79       0.00     300.00
News in competitors' patents                            10.91      19.88     -91.84     128.26
Competitors' patents in related programs               116.79      81.45       0.00     353.00
News in competitors' patents in related programs        32.84      39.90    -106.61     172.92

Table (2): Determinants of patent output at the FIRM level.

Poisson Regression. Dependent variable = Total Firm Patents, 181 observations.

                                            (1)        (2)        (3)        (4)        (5)

Intercept                               2.090**    1.195**    1.069**    0.597**    1.113**
                                        (0.062)    (0.115)    (0.127)    (0.142)    (0.166)

Dum78                                   2.835**    2.779**    2.750**    2.830**    1.996**
                                        (0.139)    (0.141)    (0.141)    (0.142)    (0.201)

Ln(SIZE): Total Firm Discovery          0.212**    0.317**    0.365**    0.373**    0.350**
                                        (0.042)    (0.043)    (0.047)    (0.047)    (0.048)

Ln(Total Firm Stock of Discovery)       0.221**     0.120*    0.155**    0.157**      0.068
                                        (0.036)    (0.051)    (0.005)    (0.053)    (0.055)

SCOPE: No. programs >500K '86$                                -0.016*    0.087**    0.103**
                                                              (0.007)    (0.015)    (0.015)

SCOPE * SCOPE                                                           -0.005**   -0.006**
                                                                         (0.001)    (0.001)

Stock of own patents                                                                0.002**
                                                                                    (0.001)

Firm dummies                                          Sig.       Sig.       Sig.       Sig.

Time                                    0.038**    0.043**    0.042**    0.042**    0.024**
                                        (0.003)    (0.005)    (0.005)    (0.005)    (0.006)

Time * Dum78                           -0.164**   -0.160**   -0.161**   -0.167**   -0.125**
                                        (0.006)    (0.007)    (0.007)    (0.007)    (0.010)

Log-likelihood                            -2365      -1219      -1216      -1185      -1169

Notes  to  Tables  (2)  through  (7) 

Standard  errors  in  parentheses. 

ln(variable)  is  set =0  when  variable =0,  and  an  appropriately  coded  dummy  variable  is  included  in  the 
regression. 

**  Significant  at  the  1%  level. 
*   Significant  at  the  5%  level. 
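
The treatment of zeros described in these notes can be illustrated with a short sketch. The function name is hypothetical, and coding the dummy as 1 when the variable is zero is an assumption about what "appropriately coded" means here.

```python
import math


def log_with_zero_dummy(value):
    """Return (ln(value), dummy) for a non-negative regressor.

    When value is zero the log term is set to 0 and the dummy is 1
    (assumed coding), so the dummy's coefficient absorbs the level
    shift for zero observations instead of the log term.
    """
    if value < 0:
        raise ValueError("expected a non-negative value")
    if value == 0:
        return 0.0, 1
    return math.log(value), 0
```

Both elements of the returned pair would enter the regression, so an observation with zero spending is distinguished from one with spending of exactly 1, for which the log is also zero.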

Table (3): Determinants of patent output at the research program level.

Poisson Regression. Dependent variable = Patents, 4930 observations.

                                            (1)        (2)        (3)        (4)        (5)        (6)

Intercept                               0.299**   -1.730**   -2.208**   -2.768**   -1.941**   -2.138**
                                        (0.004)    (0.088)    (0.107)    (0.128)    (0.128)    (0.130)

Dum78                                   2.644**    2.454**    2.520**    2.499**    1.307**    0.613**
                                        (0.139)    (0.141)    (0.142)    (0.142)    (0.146)    (0.155)

Ln(Discovery)                           0.177**    0.111**    0.099**    0.096**    0.036**    0.024**
                                        (0.010)    (0.010)    (0.010)    (0.010)    (0.009)    (0.010)

Ln(Stock of Discovery)                 0.0108**    0.077**    0.084**    0.092**    0.036**    0.031**
                                        (0.009)    (0.009)    (0.009)    (0.009)    (0.009)    (0.009)

Ln(SIZE): Total research                                      0.259**    0.332**    0.236**    0.246**
spending by firm.                                             (0.033)    (0.043)    (0.043)    (0.043)

SCOPE: No. programs > 500K '86$                                          0.082**    0.115**    0.114**
                                                                         (0.015)    (0.015)    (0.015)

SCOPE * SCOPE                                                           -0.005**   -0.006**   -0.007**
                                                                         (0.001)    (0.001)    (0.001)

News in patents in related                                               0.026**    0.032**    0.034**
programs                                                                 (0.002)    (0.001)    (0.003)

Stock of own patents in this                                                        0.033**    0.032**
program                                                                             (0.001)    (0.001)

News in competitors' patents                                                                   0.006**
in this program                                                                                (0.001)

News in competitors' patents                                                                   0.002**
in related programs                                                                            (0.001)

9 Firm dummies                                        Sig.       Sig.       Sig.       Sig.       Sig.

15 Class dummies                                      Sig.       Sig.       Sig.       Sig.       Sig.

Time                                    0.051**    0.059**    0.043**    0.048**    0.019**    0.017**
                                        (0.003)    (0.003)    (0.004)    (0.004)    (0.004)    (0.004)

Time * Dum78                           -0.155**   -0.144**   -0.146**   -0.149**   -0.089**   -0.049**
                                        (0.006)    (0.007)    (0.007)    (0.007)    (0.007)    (0.007)

Log-likelihood                           -12312      -9342      -9311      -9210      -8346      -8168

Table (4): Determinants of patent output at the research program level.
Exploring Econometric Issues.
Poisson Regression. Dependent variable = Patents, 4930 observations.

                                            Poisson       Negative     Non-Linear            GMT
                                                          Binomial  Least Squares
                                                (6)            (7)            (8)            (9)

Intercept                                  -2.138**       -2.350**         -1.223       -1.941**
                                            (0.130)        (0.190)        (0.328)        (0.243)

Dum78                                       0.613**          0.863         -0.188          0.248
                                            (0.155)        (0.240)        (0.392)        (0.256)

Ln(Discovery)                               0.024**        0.056**          0.007        0.042**
                                            (0.010)        (0.014)        (0.021)        (0.015)

Ln(Stock of Disc.)                          0.031**        0.036**          0.000        0.049**
                                            (0.009)        (0.014)        (0.021)        (0.016)

Ln(SIZE): Total disc.                       0.246**         0.127*        0.303**        0.200**
spending by firm.                           (0.043)        (0.067)        (0.092)        (0.082)

SCOPE: No. programs                         0.114**        0.139**        0.092**         0.045*
> 500K, '86$                                (0.015)        (0.023)        (0.036)        (0.024)

SCOPE * SCOPE                              -0.007**       -0.006**       -0.005**       -0.004**
                                            (0.001)        (0.001)        (0.002)        (0.001)

News in patents in related                  0.034**        0.042**        0.021**        0.048**
programs                                    (0.003)        (0.005)        (0.005)        (0.006)

Stock of own patents in this                0.032**        0.053**        0.027**        0.054**
program                                     (0.001)        (0.002)        (0.002)        (0.002)

News in competitors'                        0.006**        0.011**          0.002        0.009**
patents in this program                     (0.001)        (0.001)        (0.002)        (0.001)

News in competitors'                        0.002**          0.001        0.003**          0.001
patents in related programs                 (0.001)        (0.001)        (0.001)        (0.001)

9 Firm dummies                                 Sig.           Sig.           Sig.           Sig.

15 Class dummies                               Sig.           Sig.           Sig.           Sig.

Time                                        0.017**        0.030**        -0.016*        0.025**
                                            (0.004)                       (0.010)        (0.007)

Time * Dum78                               -0.049**       -0.061**         -0.006       -0.038**
                                            (0.007)        (0.011)        (0.018)        (0.012)

Overdispersion parameter                        N/A          1.888            N/A          0.437
                                                           (0.096)

Log-likelihood                                -8168          -7208     R2 = 0.577          -6692
                                                                      SER = 2.274

Notes to Table (4)

Poisson model as in column (6) of Table (3).

Negative Binomial variance modeled as: VAR(Y) = E[Y](1 + α E[Y])

Non-linear Least Squares: Y = exp(Xβ) + ε

GMT: Weighted non-linear least squares, with weights derived from Poisson estimates of β in column (6):

w = exp(Xβ) + η² exp(Xβ)²

Table (5): Determinants of patent output at the research program level,
Alternative measures of SCOPE.
Poisson Regression. Dep. variable = Patents, 4930 observations.

                                             (6)        (10)        (11)        (12)

Intercept                               -2.138**    -1.705**    -2.135**    -1.624**
                                         (0.130)     (0.135)     (0.192)     (0.180)

Late, 1978 Dummy                          0.613*       0.359      -0.298     -0.153*
                                         (0.155)     (0.171)     (0.171)     (0.174)

Ln(Discovery)                            0.024**      0.021*     0.026**     0.023**
                                         (0.010)     (0.010)     (0.010)     (0.010)

Ln(Stock of Discovery)                   0.031**     0.028**     0.027**     0.029**
                                         (0.009)     (0.009)     (0.009)     (0.009)

Ln(SIZE): Total disc. spending           0.246**     0.276**     0.101**     0.309**
by firm.                                 (0.043)     (0.043)     (0.030)     (0.040)

News in patents in related               0.034**     0.031**     0.033**     0.030**
programs                                 (0.003)     (0.003)     (0.003)     (0.003)

SCOPE: No. programs                      0.114**     0.112**                   0.009
> 500K, '86$                             (0.015)     (0.015)                 (0.023)

SCOPE * SCOPE                           -0.007**    -0.007**                -0.003**
                                         (0.001)     (0.001)                 (0.001)

SMALL: No. programs                                 -0.051**                -0.049**
< 500K, '86$                                         (0.005)                 (0.005)

HERF                                                             9.029**     4.569**
                                                                 (0.745)     (0.789)

HERF * HERF                                                    -16.267**    -9.638**
                                                                 (1.354)     (1.340)

Stock of own patents in this             0.032**     0.033**     0.032**     0.033**
program                                  (0.001)     (0.001)     (0.001)     (0.001)

News in competitors' patents             0.006**     0.006**     0.006**     0.006**
in this program                          (0.001)     (0.001)     (0.001)     (0.001)

News in competitors' patents             0.002**     0.003**     0.003**     0.003**
in related programs                      (0.001)     (0.001)     (0.001)     (0.001)

9 Firm dummies                              Sig.        Sig.        Sig.        Sig.

15 Class dummies                            Sig.        Sig.        Sig.        Sig.

Time                                     0.017**      -0.004      -0.006     -0.018*
                                         (0.004)     (0.004)     (0.004)     (0.005)

Time * Dum78                            -0.049**    -0.034**       0.004       0.006
                                         (0.007)     (0.008)     (0.008)     (0.009)

Log-likelihood                             -8168       -8124       -8147       -8078

Table (6): Determinants of patent output at the research program level,
Exploring changes over time. Poisson Regression. Dep. variable = Important Patents

Sample                                   1960-1988     1960-1978     1979-1988
N                                             4930          2482          2448
                                              (12)          (13)          (14)

Intercept                                 -1.624**      -2.147**        -2.092
                                           (0.180)       (0.300)       (0.345)

Late, 1978 Dummy                           -0.153*
                                           (0.174)

Ln(Discovery)                              0.023**       0.061**       -0.026*
                                           (0.010)       (0.015)       (0.012)

Ln(Stock of Discovery)                     0.029**         0.015       0.033**
                                           (0.009)       (0.013)       (0.013)

Ln(SIZE): Total discovery                  0.309**       0.326**         0.003
spending by firm.                          (0.040)       (0.096)       (0.102)

SCOPE: No. programs                          0.009       0.124**        0.098*
> 500K, '86$                               (0.023)        (0.04)       (0.054)

SCOPE * SCOPE                             -0.003**      -0.006**       -0.006*
                                           (0.001)       (0.001)       (0.003)

News in patents in related                 0.030**       0.018**       0.039**
programs                                   (0.003)       (0.003)       (0.006)

SMALL: No. programs                       -0.049**      -0.051**        -0.006
< 500K, '86$                               (0.005)       (0.006)       (0.010)

HERF                                       4.569**       7.048**       5.902**
                                           (0.789)       (1.156)       (1.969)

HERF * HERF                               -9.638**     -10.627**      -9.290**
                                           (1.340)       (1.637)       (4.403)

Stock of own patents in this               0.033**       0.030**       0.036**
program                                    (0.001)       (0.001)       (0.001)

News in competitors' patents               0.006**       0.008**        -0.000
in this program                            (0.001)       (0.001)       (0.001)

News in competitors' patents               0.003**         0.001       0.005**
in related programs                        (0.001)       (0.001)       (0.001)

9 Firm dummies                                Sig.          Sig.          Sig.

15 Class dummies                              Sig.          Sig.          Sig.

Time                                       -0.018*      -0.027**        -0.006
                                           (0.005)       (0.006)       (0.010)

Time * Dum78                                 0.006
                                           (0.009)

Log-likelihood                               -8078         -4382         -3553

Fig (1): Research, Pats/prog, over time

Figure (3): Research/program, Frequency Distrib.